The presence of three Aβ evolutionary groupings,
designated Aβb, Aβd, and A^, has been established on the
basis of a restriction fragment length polymorphism
(RFLP) analysis of the Aβ genes of 37 independently
derived mouse H-2 haplotypes. All mice analyzed, which
included laboratory inbred and wild-derived haplotypes of
Mus musculus domesticus, Mus musculus musculus, and Mus
musculus castaneus, were found to have an Aβ allele which
could be related to one of these three Aβ groupings.
No similar grouping was possible, however, in the RFLP
analysis of the Aα alleles of these same haplotypes.
All 8 of the t haplotypes studied fell into
either the Aβb or Aβd evolutionary group based on the
same RFLP analysis.


Restriction fragments were detected with a 5.8 kb
I-Aβd genomic DNA probe and a 1.2 kb I-Aαb genomic DNA
probe. The I-A alleles in each group have 50% or more
of their restriction fragments in common. Alleles in
separate groups share less than 20% of their restriction
fragments.
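
The grouping criterion above can be illustrated with a minimal
sketch. It assumes that the fraction of fragments in common is
computed as twice the number of shared fragments divided by the
total number of fragments in the two alleles; the text does not
state the exact formula. The fragment sizes, enzyme assignments,
and allele names are hypothetical and serve only as an example.

```python
def shared_fragment_fraction(frags_a, frags_b):
    """Fraction of restriction fragments two alleles have in common.

    Assumption (not stated in the text): the fraction is
    2 * shared / (fragments in a + fragments in b).
    Each fragment is an (enzyme, size_in_kb) pair.
    """
    a, b = set(frags_a), set(frags_b)
    total = len(a) + len(b)
    if total == 0:
        return 0.0
    return 2 * len(a & b) / total


def relationship(fraction):
    """Apply the 50% and 20% thresholds quoted in the text."""
    if fraction >= 0.5:
        return "same group"
    if fraction < 0.2:
        return "separate groups"
    return "unresolved"


# Hypothetical fragment patterns for three alleles.
allele_x = {("EcoRI", 5.8), ("BamHI", 2.1), ("HindIII", 4.0), ("PstI", 1.2)}
allele_y = {("EcoRI", 5.8), ("BamHI", 2.1), ("HindIII", 4.0), ("PstI", 0.9)}
allele_z = {("EcoRI", 7.5), ("BamHI", 3.3), ("HindIII", 6.1), ("PstI", 0.9)}

for name, (p, q) in {"x/y": (allele_x, allele_y),
                     "x/z": (allele_x, allele_z)}.items():
    f = shared_fragment_fraction(p, q)
    print(name, f, relationship(f))
```

Values falling between the two thresholds are reported as
unresolved here, since the text gives no rule for them.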

The polymorphism of the Aα and Aβ genes as detected
by RFLP analysis did not always correlate with known
protein sequences. In an RFLP analysis of I-Ak protein
related H-2 haplotypes, 5 mouse strains known to be
closely related to one another by serology and tryptic
peptide mapping were found to fall into two different
Aβ evolutionary groups. There are also examples of 2 Aβ
alleles being very different at the protein level, but
very similar when their RFLPs are compared. Possible
explanations of these data include the existence of
mechanisms which allow or promote gene conversion or some
form of intragenic recombination occurring in or near the
introns, possibly even exon shuffling. The presence of
three different Mus subspecies in one of the Aβ
evolutionary groups suggests that these groups arose
prior to the divergence of the subspecies, approximately
one million years ago.




INTRODUCTION 

The major histocompatibility complex of the mouse,
known as the H-2 complex, is a cluster of loci encoding
proteins whose function includes, in part, the immune
defense of the animal in its natural environment. There
are three classes of genes in the H-2 complex: the class
I genes (the strongest histocompatibility genes), the
class II genes (encoding proteins which are involved
in the presentation of foreign antigen to the regulatory
T lymphocytes; Benacerraf 1981; Klein 1975; Klein 1979),
and the class III genes (which encode complement
components). Polymorphisms in the class I and class II
genes, i.e., the presence of multiple allelic forms of a
gene at frequencies of greater than 1% in the general
population, are thought to help a species survive
the continuous onslaught of environmental pathogens.
The polymorphic nature of the class II Aα and Aβ genes,
as related to evolutionary genetic mechanisms, is the
subject of this dissertation.

Genetic and biochemical studies of class II
molecules have identified two molecules, designated
I-A and I-E, which normally are expressed on the surfaces
of antigen presenting cells and B lymphocytes (Cullen et
al. 1976; Uhr et al. 1979). The molecular cloning and
sequencing of a large portion of the murine I region
has supplied extensive information on the organization
of the murine class II genes and the molecules they
encode (Benoist et al. 1983a; Benoist et al. 1983b;
Choi et al. 1983; Malissen et al. 1983; Steinmetz et al.
1982). The Aα, Aβ, Eα, and Eβ genes are single copy genes
present on a segment of about 110 kb of DNA in the I
region of the H-2 complex (Steinmetz et al. 1982). The
Aα and Aβ genes each encode transmembrane glycoproteins
which noncovalently associate with one another on the
cell surface to form the heterodimeric I-A molecule.
Similarly, the Eα and Eβ genes encode transmembrane
glycoproteins which form the heterodimeric I-E molecule.
Studies of the DNA sequence of these class II genes have
established that they all have a common evolutionary
origin and that they represent one branch of the
immunoglobulin supergene family (Benoist et al. 1983a;
Choi et al. 1983; Malissen et al. 1983). Each class II
gene consists of at least 6 exons and occupies more than
5 kb of genomic DNA. The Aα and Aβ genes and part of the
Eβ gene are located within a portion of the I region which
exhibits extensive sequence polymorphism. This region
extends 5' from a recombinational hot spot located within
the central intron of Eβ (Steinmetz et al. 1984). The Eα
gene is located 3' to Eβ in a "conserved tract" of the I
region, and exhibits much less polymorphism than Aα, Aβ,
or Eβ. The evolutionary mechanisms responsible for the
production and maintenance of polymorphic and conserved
tracts within the I region are unknown.

Previous  studies  on  the  genetic  polymorphisms  of 
class  II  genes  in  wild  mouse  populations  have  provided 
some  insights  into  the  genetic  mechanisms  responsible 
for  their  diversification  (Wakeland  and  Darby  1983; 
Wakeland  and  Klein  1979a;  Wakeland  and  Klein  1979b; 
Wakeland and Klein 1983; Wakeland et al. 1985). Serologic
and structural analyses of the I-A molecules expressed
among a collection of H-2 haplotypes derived from wild
mice led to the definition of "families" of I-A alleles
(Wakeland and Klein 1979a; Wakeland and Klein 1983). The
I-A alleles within the same family encode antigenically
similar  molecules  that  are  identical  in  more  than  90%  of 
their  tryptic  peptides  when  compared  by  high  pressure 
liquid  chromatography  tryptic  peptide  fingerprinting 
(Wakeland  and  Darby  1983;  Wakeland  and  Klein  1979b; 
Wakeland  and  Klein  1983).  The  I-A  molecules  encoded  by 
alleles  in  separate  families  have  distinct  antigenic 
phenotypes  and  are  identical  in  less  than  70%  of  their 
tryptic  peptides.  An  analysis  of  over  40  H-2  haplotypes 
derived  from  laboratory  and  wild  mouse  strains  led  to  the 
definition  of  8  distinct  I-A  families  (Wakeland  and  Klein 
1983). 
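
As a minimal sketch of how such families might be assembled from
pairwise tryptic peptide comparisons, the example below links any
two alleles whose molecules are identical in more than 90% of their
peptides and reports the connected sets as families. The linking
rule (single linkage) is an assumption made for illustration, and
the allele names and identity values are hypothetical; neither is
taken from the cited studies.

```python
def group_into_families(alleles, identity, threshold=0.90):
    """Group alleles into families by pairwise peptide identity.

    Assumption: two alleles join the same family whenever their
    pairwise identity exceeds the threshold (single linkage).
    `identity` maps (allele_a, allele_b) pairs to a fraction 0..1.
    """
    parent = {a: a for a in alleles}

    def find(a):
        # Follow parent links to the family representative.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for (a, b), fraction in identity.items():
        if fraction > threshold:
            parent[find(a)] = find(b)

    families = {}
    for a in alleles:
        families.setdefault(find(a), []).append(a)
    return sorted(sorted(members) for members in families.values())


# Hypothetical alleles and peptide identity fractions.
alleles = ["w1", "w2", "w3", "w4"]
identity = {
    ("w1", "w2"): 0.95, ("w1", "w3"): 0.60, ("w1", "w4"): 0.55,
    ("w2", "w3"): 0.65, ("w2", "w4"): 0.62, ("w3", "w4"): 0.93,
}
families = group_into_families(alleles, identity)
print(families)
```

In this hypothetical data set every between-family comparison is
below 70%, matching the separation between families described above.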

This dissertation describes the RFLP analysis of the
Aα and Aβ genes of 37 standard laboratory inbred and wild
mouse strains from the 3 separate subspecies Mus musculus
domesticus, Mus musculus musculus, and Mus musculus
castaneus, including wild mice bearing t haplotypes.
Based on the data generated using 6 to 7 restriction
endonucleases (RE) on each haplotype examined for Aβ,
all of the homozygous haplotypes examined could be placed
into one of three evolutionary groupings. Because Aβ
genes of the subspecies Mus musculus musculus, Mus musculus
domesticus, and Mus musculus castaneus are found in one Aβ
evolutionary group, these evolutionary groups probably
predate subspeciation in the mouse, which occurred
approximately 1 million years ago. Since both the Aα
and Aβ probes contain approximately 80% noncoding
sequence, and since in some instances the protein sequences
or tryptic peptide maps could be compared between the
haplotypes being examined, the Aβ evolutionary groups
that I have found appear to be determined predominantly
by noncoding sequence. Finally, the Aα and Aβ genes of
wild mice bearing t haplotypes, examined by RFLP analysis,
also fell into the separate Aβ evolutionary groups.


REVIEW  OF  LITERATURE 

The  major  histocompatibility  complex  (MHC)  of  the 
mouse,  also  known  as  the  H-2  complex,  has  been  an  area 
of  fascination  in  modern  genetics  since  its  discovery 
laid  the  groundwork  for  serology  as  a  tool  to  study 
genetics  (Gorer  1936;  Gorer  1938).  The  groundwork  for  the 
discovery  of  the  genetic  basis  of  the  H-2  complex  had 
been  established  by  Little  and  Tyzzer  (1916)  using  inbred 
mouse  lines.  From  the  initial  characterization  of 
erythrocyte  antigens,  through  protein  biochemistry  and 
sequencing, and up to modern genetic engineering, the
MHC  has  continued  to  allow  us  to  learn  about  genetics, 
natural  selection,  cellular  physiology,  and  other  far 
reaching  areas  of  biology. 

The  class  II  genes  of  the  H-2  complex  are  located 
in  the  I  region,  so  named  because  that  region  was 
originally  defined  by  the  differential  ability  of  inbred 
mouse  strains  to  mount  an  immune  response  to  certain 
simple antigens (Martin et al. 1981; McDevitt and Chinitz
1969; McDevitt and Sela 1965; Martin et al. 1971). The
actual mapping of the immune response genes within the
H-2 complex was accomplished with the use of inbred
congenic and recombinant mouse strains (Benacerraf and
McDevitt 1972; McDevitt et al. 1972), although the actual
identity of these gene products was in question for a
number of years. The class II genes encode proteins that
restrict the recognition of foreign antigen by the
regulatory T lymphocytes to those antigen presenting
cells which are of the same allelic form, and are thus
critical for the development of a normal immune response.
This literature review will focus primarily on class II
gene structure and possible functional correlates of the
genetic structure.

Major  Histocompatibility  Complex  Structure 

General  Organization  and  Protein  Structure 

The murine major histocompatibility complex (or H-2
complex) encompasses about 2 centimorgans of DNA, which
may be equivalent to as much as 2000-4000 kb of DNA
(Hood et al. 1982; Klein 1975), and contains at least 3
classes of immunologically related genes, denoted class
I, class II, and class III (Klein 1975; Snell et al.
1976). The molecules encoded by the class I genes are of
2 general types. The class I genes designated K, D, L,
and R encode the classical transplantation antigens
located on the cell surface of most nucleated cells and
are known to be very polymorphic. These molecules are
primarily involved in the restricted recognition of some
viral antigens by cytotoxic T lymphocytes (Zinkernagel
1979). The other general type of class I genes,
designated Qa and Tla, are expressed on nucleated blood
cells (Qa) or on thymocytes and certain leukemias (Tla)
(Michaelson et al. 1983), are much less polymorphic, and
have functions that are not yet known (Flaherty 1980). The
Qa and Tla genes are located telomeric to the D, L, and
R genes and number over 30 (Winoto et al. 1983). The
molecular structure of both types of class I molecules
consists of a 40,000-45,000 dalton membrane bound
glycoprotein of approximately 350 amino acids which
forms 3 extracellular domains of about 90 amino acids
per domain. A fourth domain in the form of β2
microglobulin, a 12,000 dalton polypeptide encoded on
chromosome 2 in the mouse, noncovalently associates with
the class I gene product, possibly in a stabilizing role
(Klein et al. 1983b). The relative locations of the class
I genes to the class II and class III genes, as well as
their general protein structure, are illustrated in
figure 1.

The class III genes of the H-2 complex encode
complement components such as C2, Bf, Slp, and C4.
The linkage association of some of the complement
genes to the MHC of different animals varies (Alper
1981). While there has been some argument for the
inclusion of the complement loci in the MHC on the basis
of the MHC and linked loci possibly evolving as a genetic
unit (Bodmer 1976), Klein et al. (1983b) argue against a
purely physical inclusion of the complement genes as part
of the MHC in their review article.

The class II genes map between the K and S regions
of the H-2 complex (figure 1). There are 4 functional
class II genes denoted Aα, Aβ, Eα, and Eβ, as well as the
pseudogenes Aβ3, Aβ2, and Eβ2 (Steinmetz et al. 1986;
Widera and Flavell 1985). The overall organization of
the I region and its protein products are illustrated in
figure 1. The class II glycoproteins are made up of a
28,000 dalton β chain of about 230 amino acids, and a
34,000 dalton α chain of about 220 amino acids (Klein et
al. 1983b). Both the α and the β chains consist of five
protein domains including a hydrophobic leader peptide of
about 25 amino acids absent in the mature cell-surface
form of the molecule, two approximately 90 amino acid
extramembrane domains (α1, α2 or β1, β2), a hydrophobic
transmembrane segment of about 25 amino acids, and a
cytoplasmic tail region containing a high proportion of
charged residues (Mengle-Gaw and McDevitt 1985). Each of
the α1, α2, β1, and β2 domains is formed by a disulfide
cystine bridge. The two α and two β domains noncovalently
associate to form the I-A molecule. The presence of
four  extramembrane  protein  domains  appears  to  be  a 
stabilizing  configuration  for  both  the  class  I  and  class 
II  molecules.  The  structuring  of  the  MHC  protein 
molecules  into  domains  reflects  the  basic  organization 
of  the  encoding  DNA  into  exons  and  introns. 

Structure of the Class II Genes

Both the Aβ and Eβ genes are made up of six exons,
one exon for each of the five protein domains, plus one
exon encoding the 3' untranslated region (Saito et al.
1983). Both the Aα and Eα genes, though very similar to
the β genes, are made up of only five exons. This
difference is due to the transmembrane and cytoplasmic
tail regions of the α chains being encoded by a single
exon (Mathis et al. 1983; McNicholas et al. 1982).

It is important to note the polymorphic nature of
the Aβ, Aα, and Eβ genes, although they will be discussed
in detail in a later section. The presence of multiple
allelic forms of the Aβ and Eβ genes may imply a unique
role for the encoded proteins in the ability of a
population to respond to an antigen. The actual molecular
interplay between the class II molecule, the antigen, and
the T cell receptor is still largely unknown.

Brief  History  of  the  I  Region  Loci 

The  I  region,  which  contains  the  class  II  genes, 
has  had  a  relatively  short,  but  turbulent,  history.  As 
has  already  been  mentioned,  the  I  region  was  discovered 
and  mapped  by  the  differential  ability  of  inbred  and 
congenic  animals  to  respond  to  certain  simple  antigens 
(Benacerraf  and  McDevitt  1972;  Martin  et  al.  1971; 
McDevitt and Sela 1965; McDevitt et al. 1972). Even then,
the I region gene products were mistakenly thought to be
related to, but not a part of, the MHC and somehow
related to the T cell receptor (McDevitt and Chinitz
1969).

Historically,  five  subregions  were  defined  in  the  I 
region,  the  A,  B,  J,  E,  and  C  subregions  (Murphy  1981). 
The  B  locus  was  originally  postulated  by  Lieberman 
et  al.  (1972)  to  explain  what  appeared  to  be  a  response 
to  an  allotypic  determinant  on  an  IgG2a  molecule  known 
as  MOPC173.  Responses  to  several  other  antigens  were 
mapped  to  the  B  locus  including  lactate  dehydrogenase  B 
(Melchers et al. 1973), staphylococcal nuclease (Lozner
et al. 1974), oxazolone (Fachet and Ando 1977), and H-Y
(Hurme et al. 1978). But the data from the different
laboratories have not always corroborated the existence
of the B locus, and several alternative explanations have
in fact been offered based on an interplay of the A and E
loci (Baxevanis et al. 1981). Moreover, no protein product
has been detected and sequencing data have not
demonstrated the presence of any gene corresponding to
the B locus (Hood et al. 1983; Steinmetz et al. 1982;
Steinmetz et al. 1986).

The  C  subregion  was  first  discovered  with  an  H-2h2 
anti-H-2h4  antiserum  (David  and  Shreffler  1974).  Other 
evidence  in  support  of  the  existence  of  the  C  subregion 
was found by Rich et al. (1979a; 1979b) with the presence
of C-specific antibodies that reacted with a suppressor
factor produced in a mixed lymphocyte culture. The C
locus was mapped, using recombinant inbred strains,
telomeric to the Eα locus and centromeric to the genes
encoding the C4 component of complement. Other
investigators have been unable to confirm many of the
results dealing with the C locus and so question its
existence (Juretic et al. 1981; Livnat et al. 1973). In
the most current molecular cloning data of this region
(Steinmetz et al. 1986) there is no evidence for the C
locus in the 150 kb of DNA telomeric to the Eα locus,
although the entire chromosomal segment in question has
not been characterized. And once again, no protein
product has been isolated from the C locus. The existence
of both the B and the C loci is based entirely on their
possible regulatory effect on the immune response of the
mouse.

The  J  subregion  is  the  third  subregion  from  which 
no  protein  product  has  been  well-characterized,  although 
anti-J  antiserum  and  monoclonal  antibodies  have  been 
produced  by  several  laboratories  (Kanno  et  al.  1981; 
Murphy  1978;  Waltenbaugh  1981).  In  fact,  the  J  locus  has 
been  perhaps  the  most  publicized  single  gene  in 
immunology,  and  the  most  controversial,  other  than  the 
T  cell  receptor.  Thousands  of  papers  have  been  published 
on  either  the  J  product,  or  on  its  role  as  the  class  II 
element  controlling  the  T  suppressor  cells.  The  J 
subregion  was  originally  defined  by  reciprocal  antisera 
raised  between  inbred  congenic  mouse  strains  B10.A(3R) 
and B10.A(5R) as well as the mouse strains B10.HTT and
B10.S(7R).  The  same  mouse  combinations  mapped  the 
location  of  the  J  subregion  between  the  A  and  E  loci 
(Murphy  et  al.  1976;  Murphy  1978).  These  alloantisera 
recognize  soluble  suppressor  factors  secreted  by  these 
cells  as  well  as  recognizing  polymorphic  determinants 
expressed  on  T  suppressor  cells  (Krupen  et  al.  1982; 
Murphy  1978;  Tada  et  al.  1976;  Taniguchi  et  al.  1980; 
Waltenbaugh  1981).  Although  no  protein  has  been 
positively  identified  for  the  J  locus,  Taniguchi  et  al. 
(1982)  report  finding  a  25,000  dalton  protein  using  an 
anti-J  monoclonal  antibody. 

In  the  first  extensive  DNA  level  characterization  of 
the  murine  I  region,  Steinmetz  et  al.  (1982)  found  no 
evidence  for  the  existence  of  the  J  locus  within  the  I 
region. Due to a hotspot for recombination at the 3' end
of the Eβ gene, these authors were able to map the
suspected position of J between the A and E loci and
found that if it was located there it would have to be
encoded by less than 3.4 kb of DNA. In further DNA
cloning analysis of this region and more RFLP mapping
of additional intra-I region recombinants, Kobori et al.
(1984) shortened the distance to about 2.0 kb,
making it even more unlikely that J might be encoded
here. Related experiments have shown that cloned DNA
encompassing this critical 3.4 kb segment fails to
hybridize to RNA from J positive T suppressor cell lines
(Kronenberg et al. 1983), thus making unlikely the
presence of an I region encoded J gene product.
Alternative explanations for the location of J have been
offered (Hayes et al. 1984; Klyczek et al. 1984), but
have not been substantiated.

The  A  and  E  subregions  have  survived  the  despoilment 
of  the  I  region  which  occurred  with  the  onslaught  of 
molecular  analysis  of  the  I  region  DNA.  These  two 
subregions  contain  genes  which  encode  four  cell  surface 
protein  products  which  have  been  identified  by 
serological  and  biochemical  methods  (Jones  1977;  Uhr  et 
al.  1979).  The  A  subregion  contains  at  least  three  loci 
that  encode  class  II  molecules  which  are  expressed  on 
the cell surface: Aβ, Aα, and Eβ (Jones et al. 1978). The
E subregion contains a fourth locus that is known to
encode a molecule expressed on the cell surface, Eα
(Jones et al. 1978). It is important to note that with
the molecular characterization of the genome containing
the I region, the nomenclature of I "subregion" is no
longer appropriate. The term originally defined the I
region loci, several of which are now generally believed
to have been artifactual for the reasons listed above.
Recombinational events are more accurately represented
when the class II genes are viewed as part of a
continuum of DNA rather than through the archaic concept
of subregions. Henceforth in this literature review
individual loci shall be referred to by their gene
designation, for example, Aβ. These loci, plus other
class II loci recently discovered in the genome, will now
be discussed in more detail.

Cloning  and  Sequencing  of  the  Murine  Class  II  Genes 

The  cloning  and  sequencing  of  the  murine  class  II 
genes  were  based  in  part  on  technical  advances  and  new 
approaches  made  while  isolating  the  human  class  I 
(Ploegh et al. 1980; Sood et al. 1980), and class II
(Auffray et al. 1982; Korman et al. 1982a, 1982b;
Larhammar et al. 1982a; Lee et al. 1982a; Yang et al.
1982) genes. Protein sequence comparisons done earlier,
reviewed by Nathenson et al. (1981), have already
established the homology between humans and mice when
comparing DNA sequences of the class I genes. Similar
work on the class II gene products also reveals strong
homologies between mice and humans (Allison et al. 1978;
Cook  et  al.  1979).  More  indicative  of  the  evolutionary 
stability  of  the  class  II  gene  products  in  a  dynamic 
molecular  environment  is  the  maintenance  of  the  domain 
structure  as  the  basic  functional  unit  of  the  molecule. 

The  most  revealing  feature  of  the  class  II  proteins 
in  terms  of  their  evolutionary  origins  is  the 
aforementioned  domain  structure  and  their  sequence 
relatedness to other immunological molecules. In all the
class II genes, each structural domain is consistently
encoded by a separate exon, as is the case for the
class I, β2-microglobulin, Thy-1, and antibody genes
(Kaufman et al. 1984).
Domain  structure  alone  does  not  indicate  homology  between 
proteins,  since  similar  domains  have  been  found  in  such 
proteins  as  superoxide  dismutase  (Richardson  et  al. 
1976),  but  taken  together  with  the  nucleotide  sequence 
homology  found  between  these  immunological  molecules 
(Benoist et al. 1983a; Bregegere et al. 1981; Korman et
al.  1982b;  Larhammar  et  al.  1982b;  McNicholas  et  al. 
1982;  Parnes  and  Seidman  1982;  Steinmetz  et  al.  1981), 
there  is  strong  evidence  for  the  existence  of  a  common 
ancestral  gene  (Peterson  et  al.  1975).  Other  similarities 
between  members  of  the  immunoglobulin  supergene  family 
include  similar  placement  and  size  of  the  disulfide 
bridges  and  RNA  splicing  according  to  the  GT/AG  rule 
(Hood et al. 1983). The T8 cell surface glycoprotein
expressed  by  most  cytotoxic  T  lymphocytes  has  also  been 
determined  to  belong  to  the  immunoglobulin  supergene 
family  by  domain  structure  and  cDNA  sequencing  (Sukhatme 
et  al.  1985),  as  has  the  T4  molecule  (Maddon  et  al. 
1985).  Thus,  evolution  through  gene  duplication  and 
divergence  (Ohno  1970)  may  be  an  ancient  mechanism  for 
the  immune  system  gene  family. 

Although the murine class II genes have an exon-intron
organization that corresponds to the domain
organization of the expressed protein product, the murine
class II α gene structure differs from that of the murine
class II β gene. A large intron separates the exon
encoding the signal peptide from that encoding the first
domain in the class II α gene, and the 3' untranslated
region is split between two exons, but the transmembrane
and cytoplasmic regions are encoded by a single exon
(Benoist et al. 1982; McNicholas et al. 1982). This
genetic structure is similar to that of the murine
β2-microglobulin gene (Suggs et al. 1981). In contrast,
the murine class II β genes have a large intron between
the exons encoding the first and second extracellular
protein domains. The transmembrane, cytoplasmic, and 3'
untranslated regions are split over three exons
(Larhammar et al. 1983a; Saito et al. 1983), an
arrangement more similar, though not identical, to the
class I heavy chain gene structure (Malissen et al.
1982). The genomic structure of the Aβ gene can be seen
in figure 2.

As mentioned above, because of the many
structural and sequence homologies between the members
of the immunoglobulin supergene family, there is a strong
possibility that they have all evolved from a common
ancestral gene. It is important to keep in mind that one
cannot distinguish between convergent and divergent
evolution (Hood et al. 1983). The membrane proximal
domains of these molecules have the most sequence
homology and are therefore even more likely to have a
common origin. But the nearly identical external domain
size and disulfide bridge placement of the different
members of the immunoglobulin supergene family argue
strongly for a common ancestral gene evolving in a
divergent manner following gene duplication.

Genome  Organization  of  the  Class  II  Genes 

The first evidence at the DNA level of the linkage
of class II genes was provided by Steinmetz et al. (1982)
with their cloning of about 230 kb of DNA isolated from
a BALB/c sperm DNA cosmid library. The cosmid library
was first probed with a DRα cDNA probe (characterized by
Wake et al. 1982), and then probed by single copy genetic
fragments subcloned from contiguous cosmids. In this
manner, Steinmetz et al. were able to "walk" along the
chromosome as long as there were cosmid clones in the
genomic library that contained overlapping fragments of
genomic DNA, identifying approximately 200 kb of linked
DNA in the process. The telomeric boundary of the I
region was defined as the structural gene for the C4
complement component, mapping about 90 kb downstream from
Eα. The centromeric boundary of the I region was not
determined in this particular publication, but several
other important discoveries were made. First, four class
II genes were identified, one as a possible pseudogene
because a 5' probe failed to hybridize to the gene.
Second, the BALB/c genome was determined to contain two
α and four to six β genes, a finding which has been borne
out in more recent work (Widera and Flavell 1985). Third,
Steinmetz et al. (1982) also reported that the Eα and Eβ
genes are present in strains of mice which do not express
an E molecule, e.g., the b and s haplotypes, which express
the protein in the cytoplasm, and the f and q haplotypes,
which do not express the protein at any detectable level.
This finding has led to more work on control of gene
expression in the class II gene system. Fourth,
correlation of the molecular map with the serologically
and genetically determined map of the I region led
scientists to question the existence and location of
the B and J genes. Finally, a recombinational hotspot
was identified where nine independently generated
recombinant mouse lines were found to all have recombined
within the same 3.4 kb of DNA. Kobori et al. (1984) have
further characterized six of the murine I region MHC
recombinants using Southern blot DNA analysis to limit
the recombination region in these strains to less than
2.0 kb of DNA. This 2.0 kb contains part of the intron
between the first and second protein encoding exons,
and part of the second domain encoding exon.

Figure 3 shows the most recent concept of the I
region at the DNA level. Other class II genes which have
been characterized recently are Aβ2, Aβ3, and Eβ2.
Larhammar et al. (1983a) identified Aβ2 and located it
about 20 kb centromeric to Aβ. Larhammar et al.
(1983b) sequenced the genomic Aβ2 of the b haplotype
isolated from cosmid clone 19-101. The exon-intron
structure of Aβ2 is the same as for the other class II
β genes. The predicted amino acid sequence of Aβ2, as
interpreted from the nucleotide sequence, shows only up
to 56% homology to the other β chains, including the
human β chain class II proteins. These other β chains
typically show up to almost 80% homology to each other.
On this basis, the Aβ2 second domain sequence was
determined to be the most divergent member of the class
II β gene family. Larhammar et al. (1983b) also cloned
and sequenced a cDNA clone, proving transcription of Aβ2
does occur, although it was not detected on the cell
surface, and some possible splicing errors were detected.
When Aβ2 was used as a probe to hybridize blots of other
strains, a lesser degree of polymorphism was detected.

The  latest  class  II  gene,  and  possibly  the  last  in 
the I region, is Aβ3. Widera and Flavell (1985) isolated
Aβ3 from a b haplotype cosmid library and were able to
link it 75 kb telomeric to the class I H-2K region. The
nucleotide sequence of the β2 domain of Aβ3 has homology
to the immunoglobulin-like domains of other class II
genes, and 83% homology to the human SBβ gene. An
examination of the nucleotide sequence also showed a
deletion of 8 nucleotides which makes translation of
this gene into a functional protein impossible. The
existence of Aβ3 in another haplotype was confirmed by
Steinmetz et al. (1986) with their cosmid cloning of the
BALB/c Aβ3. Whereas Widera and Flavell (1985) linked the
K region with Aβ3, Steinmetz et al. (1986) were also able
to link Aβ3 with the rest of the I region, effectively
providing a 600 kb continuous DNA map of the K and I
regions. Therefore, as illustrated in Figure 3, the order
of the genes discussed is K2, K, Aβ3, Aβ2, Aβ, Aα, Eβ,
Eβ2, and Eα. In addition, Steinmetz et al. (1986)
localized  two  short  regions  of  DNA  which  had 
recombination  frequencies  of  0.6%  to  1.5%  between  genes 
from  Mus  musculus  castaneus  and  standard  laboratory  mouse 
strains (Mus musculus domesticus). Such hotspots for
recombination  may  be  instrumental  in  the  generation  of 
polymorphism  in  the  class  II  genes. 

Polymorphism  of  the  Class  II  Genes 

The  most  unusual  feature  of  the  MHC  in  the  murine 
system,  or  in  vertebrate  systems,  is  the  extensive 
polymorphism  of  certain  of  the  class  I  and  class  II 
genes. Of the class II molecules, the β chain proteins
have been known to be the most polymorphic, and Eα the
least polymorphic (Klein et al. 1983a). The polymorphism
of the class II genes has generally agreed with that
found in the proteins, but Aα has been
determined to be more polymorphic than originally thought
by many (Benoist et al. 1983a). As mentioned earlier,
this  unique  degree  of  polymorphism  implies  a  unique 
biological  role  for  the  encoded  glycoproteins. 

Biological Role of Polymorphism

The class II molecules are involved in the
communication between immunocompetent cells (thymic
education aside) to induce and maintain a defensive reaction
to what the body perceives as a foreign invasion. The
class  II  molecules  are  key  elements  in  the  activation  of 
an  immune  response  via  regulatory  T  lymphocytes. 

The  discovery  and  characterization  of  the  class  II 
molecules  have  already  been  described  in  detail.  The 
interaction  between  the  antigen,  the  class  II 
glycoprotein, and the T cell receptor determines if an
animal  is  able  to  mount  an  immune  response  to  a 
particular  antigen.  The  T  cells  apparently  cannot 
recognize  free  antigen  as  the  B  cells  can  (Moller  1978; 
Moller  1980).  The  function  of  the  class  II  glycoprotein, 
then,  is  to  enable  the  T  lymphocytes  to  recognize  a 
foreign  antigen  so  that  they  can  respond  appropriately. 
This  process  is  known  as  MHC  restriction.  Because  the  T 
cell  receptor  only  recognizes  the  class  II  glycoprotein 
which  is  of  the  identical  allelic  haplotype  as  itself, 
the  process  is  also  sometimes  referred  to  as  I-region 
restriction or self-MHC restriction (Klein et al. 1981;
Nagy et al. 1981).

The  T  cell  receptor,  therefore,  must  recognize  and 
form  a  ternary  complex  with  two  ligands  (Schwartz  1985). 
One of these ligands is the antigen itself, usually in
the form of a partial degradation product generated by an
antigen presenting cell. The other ligand is the class II gene
product expressed as a transmembrane glycoprotein present
most abundantly on B lymphocytes and antigen presenting
cells (Asano et al. 1983; Hammerling et al. 1974; Katz
et al. 1973; Kindred and Shreffler 1972; Nagy et al.
1981). Each member of the ternary complex possesses a
precise  and  high  binding  affinity  for  each  other  member 
of  the  ternary  complex;  otherwise,  the  biological 
triggering  of  the  T  helper  cell,  and  consequently  the 
stimulation of an antibody response, does not occur. It
is  in  this  specificity  of  binding  that  the  role  of 
polymorphism  of  class  II  glycoproteins  can  best  be 
understood. 

The  extent  of  the  polymorphism  of  the  class  II  gene 
products,  although  very  high,  is  not  nearly  enough  to 
explain  the  ability  of  the  class  II  glycoproteins  to 
control  immune  responsiveness  to  the  enormous  number  of 
foreign  antigens  an  animal  is  able  to  respond  to.  The 
precise  mechanism  by  which  the  class  II  glycoproteins 
trigger  specific  immune  responses  is  still  not  known. 
There is evidence that the T cells can differentiate
class II gene products in association with molecules
which are subtle structural variants of one another, as
with insulin (Rosenthal 1978), lysozyme (Adorini et al.
1979), and cytochrome c (Solinger et al. 1979).

The  polymorphic  nature  of  the  class  II  glycoproteins 
might  be  explained  by  the  fine  balance  the  immune  system 
seems to maintain. There are many animal models, and
human models, of diseases caused by the immunocompetent
cells attacking self. Even the prevalence of
allergies and asthma in the human population
suggests that control of the immune system is relatively
easily thrown off. If so, then the presence of many
haplotypes in a population would mean that a given animal,
with at most two haplotypes, would be less likely to
react with an innocuous antigen, a reaction which would
lower that animal's selective advantage. However, having
many alleles with which to defend the species against a
threatening plague would be of tremendous selective
advantage to the population at large. If one class II
glycoprotein  could  not  present  a  particular  dangerous 
antigen  to  the  immune  system,  then  perhaps  another  allele 
in  the  population  could  (Zinkernagel  1979).  Although 
selection  operates  on  the  individual  level,  mechanisms 
which  would  enhance  the  introduction  of  new  alleles  could 
have  a  selective  advantage.  The  proof  of  a  postulate 
such  as  the  one  suggested  above  awaits  appropriate 
experimental  design  and  statistical  analysis.  The 
possible  heterozygous  advantage  involving  class  II  genes 
also needs to be taken into account. Nevertheless, the
existence of the polymorphism through evolutionary time
suggests its importance in the survival of the species
and the importance of the mechanisms which have
generated and maintained the polymorphism.

Mechanisms of Generating Polymorphism

If  the  class  II  genes  are  viewed  as  being  in  a 
dynamic  state  of  flux  in  evolutionary  time  rather  than 
being  static  structures,  then  visualizing  the  genetic 
mechanisms which have generated the polymorphisms, and
the selective pressures which have maintained them, is
more revealing. The entire immunoglobulin gene
superfamily, which includes the class II genes, appears to
have arisen by gene duplication and divergence. Two of
the  most  popular  possibilities  for  divergence  of  the 
class  II  genes  into  polymorphic  alleles  are  unequal 
crossing  over  and  gene  conversion. 

Unequal crossing over and gene conversion, originally
found in fungi (Radding 1978), are mechanisms whereby DNA
sequence is transferred or copied from one gene to
another. Although by definition the DNA sequence can be
transferred from and to genes anywhere in the genome, such
transfer is much more probable within tandem multigenic
or multiallelic families (Baltimore 1981; Egel 1981;
Robertson  1982;  Slightom  et  al.  1980).  Pairing  between 
partially  homologous  sequences  during  meiosis  or  mitosis 
would  occur,  followed  by  mismatch  repair  which  converts 
part  of  one  sequence  to  the  other.  The  primary  evidence 
for  gene  conversion  is  the  discovery  of  clusters  of 
substitutions,  especially  at  the  DNA  level.  While  these 
"tracts"  of  nucleotide  substitutions  have  been  clearly 
demonstrated  in  class  I  genes  (Mellor  et  al.  1983;  Weiss 
et al. 1983a; Weiss et al. 1983b), there is also evidence
(Mengle-Gaw  et  al.  1984;  Widera  and  Flavell  1984),  though 
not  as  thoroughly  documented,  for  a  similar  mechanism 
acting  on  the  class  II  genes. 

Regions  of  allelic  hypervariability  have  been 
reported in the murine Aα gene (Benoist et al. 1983b),
suggesting that this gene has more of a polymorphic
nature than previously thought by some (Cullen et al.
1976; Klein and Figueroa 1981; Klein et al. 1981),
though a few scientists had evidence for an unexpectedly
high degree of polymorphism for the Aα gene (Cecka et al.
1979; Cook et al. 1979). Benoist et al. (1983b) sequenced
a total of six different Aα alleles, including the k, d,
b,  f,  u,  and  q  haplotypes,  and  compared  their  cDNA 
sequences.  Not  only  did  they  find  a  surprising  degree 
of  polymorphism,  they  also  found  that  the  amino  acid 
substitutions  were  clustered  in  the  first  domain  exon. 
In  fact,  many  of  the  substitutions  were  localized  at  a 
few  highly  variable  positions  within  the  first  domain 
exon.  Also,  40  out  of  46  dinucleotide  changes,  which  are 
indicative  of  nucleotide  sequence  fluidity,  occur  in  the 
first domain exon. Translation of the cDNA nucleotide
sequences into the corresponding amino acid sequences for
the six haplotypes reveals the polymorphism of
domain one, and the corresponding Kabat-Wu variability
plot (Kabat et al. 1979) shows two regions of
"allelic hypervariability" at residues 11-15 and at
56-57. These regions, however, are not nearly as variable
as the immunoglobulin hypervariable regions.
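
The variability plot mentioned above can be illustrated with a
minimal sketch. It assumes the usual Wu-Kabat definition, in which
the variability at a position is the number of different amino acids
observed there divided by the frequency of the most common one. The
function name and the toy alignment are hypothetical and are not the
Aα sequences of Benoist et al. (1983b).

```python
from collections import Counter


def wu_kabat_variability(aligned_seqs):
    """Return the Wu-Kabat variability index for each alignment column.

    variability = (number of different residues at the position)
                  / (frequency of the most common residue)
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")

    n = len(aligned_seqs)
    scores = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        most_common_count = counts.most_common(1)[0][1]
        # Dividing by (most_common_count / n) equals multiplying by n / count.
        scores.append(len(counts) * n / most_common_count)
    return scores


# Hypothetical toy alignment of six "alleles" (illustration only).
toy_alleles = ["MKTAY", "MKSAY", "MRTAY", "MKGAF", "MKTAY", "MQTAY"]
variability = wu_kabat_variability(toy_alleles)
```

An invariant position scores 1.0, the minimum; positions carrying
several different residues score higher, which is how clusters such
as residues 11-15 would stand out in a plot of this index.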

The polymorphism of Aα still leaves open the
question of how it was generated. Because Aα is not a
member of a large gene family it might not be a good
candidate for gene conversion, although one must still
consider interallelic gene conversion. Benoist et al.
(1983b) do mention the likelihood of interallelic
conversion in heterozygous wild mice, which will be
discussed later in this dissertation, but they do
not feel it sufficient to explain the generation of
polymorphism in Aα. The Aα gene lacks the clustering of
nucleotide substitutions that would make it a good
candidate for gene conversion, and a clear donor of
sequence material has not yet been detected (Benoist et al. 1983a).
They offer instead a hypothesis of a gene duplication
event followed by slow drift in one of the copies,
with the other copy acquiring a degree of sequence
instability which would lead to a high rate of point
mutations.  Data  presented  in  the  results  of  this 
dissertation  tend  to  support  some  type  of  conversion 
event  over  simply  the  accumulation  of  point  mutations. 

Regions  of  allelic  hypervariability  have  also  been 
reported for Eβ (Mengle-Gaw and McDevitt 1983). Again,
these regions were found only in the first domain and
correspond to the hypervariable regions found both in
the alleles at a particular locus, and between β loci.
Clusters of polymorphism separated by sequences of
nucleotide homology found both among the Eβ alleles and
between the β loci suggested to the authors the
possibility  of  generation  of  this  polymorphism  by  a  gene 
conversion  type  event. 

Genomic  clones  from  three  different  haplotypes,  the 
b,  d,  and  k  haplotypes,  have  been  isolated  and  their  DNA 
sequences  compared  to  one  another  (Choi  et  al.  1983). 
While  the  overall  structural  organization  of  these 
genomic  clones  was  determined,  unfortunately  only  the 
exons  were  sequenced  at  the  time.  The  authors  determined 
that  there  is  a  concentration  of  amino  acid  substitutions 
in  the  amino  terminal  portion  of  the  encoded  molecule 
and  that  the  pattern  of  nucleotide  substitutions  is 
consistent  with  multiple  independent  mutational  events. 
Their  restriction  map  analysis  of  sequences  flanking  the 
exons  suggests  that  there  may  be  large  differences 
between  the  haplotypes,  which  agrees  with  the  data 
presented  in  this  dissertation.  They  interpret  their 
data  as  being  inconsistent  with  gene  conversion,  but 
do  not  take  into  consideration  the  low  number  of 
haplotypes  they  analyzed. 

Evidence for gene conversion in a class II β gene
has been reported by Mengle-Gaw et al. (1984). They have
isolated an alloreactive T cell clone, 4.1.4, that
recognizes a determinant present on both Eβb and Aβbm12.
Comparison of the nucleotide sequence of Aβb (Choi et al.
1983) and Aβbm12 (McIntyre and Seidman 1984) to the cDNA
sequence of Eβb revealed that the bm12 sequence is
identical to the Eβb sequence in the region where it
differs from Aβb. The particular region where the
conversion event may have occurred includes three
nucleotides in a clustered region of 14 nucleotides
in the sequence coding for amino acids 67-71. This
region is also flanked by regions of exact homology which
extend 20 nucleotides 5' and 9 nucleotides 3'. These
flanking regions may provide stabilization of
heteroduplex formation between the genes, which might
potentiate sequence transfer. The T cell clone 4.1.4 was
found to recognize a determinant shared by Aβbm12 and
Eβb, so the possible gene conversion event would have
occurred in a functional zone. Previous information which
led to this interest in the bm12 mutation includes
genetic mapping of the bm12 mutation to within the Aβb
gene (Hansen et al. 1980), and tryptic peptide data
showing the bm12 mutant to differ from its C57BL/6 parent
only in its Aβ polypeptide (Lee et al. 1982b; McKean
et al. 1981).
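
The three-way comparison described above (parent allele, mutant
allele, and putative donor gene) can be sketched as follows. This is
only an illustration of the logic, assuming pre-aligned sequences of
equal length; the function names and the 20-nucleotide toy sequences
are hypothetical and are not the published Aβb, Aβbm12, or Eβb
sequences.

```python
def conversion_candidates(parent, mutant, donor):
    """Positions where the mutant differs from its parent but matches the donor."""
    if not (len(parent) == len(mutant) == len(donor)):
        raise ValueError("sequences must be aligned to equal length")
    return [
        i
        for i, (p, m, d) in enumerate(zip(parent, mutant, donor))
        if m != p and m == d
    ]


def flanking_identity(mutant, donor, first, last):
    """Count identical bases running 5' of `first` and 3' of `last`."""
    left = 0
    i = first - 1
    while i >= 0 and mutant[i] == donor[i]:
        left += 1
        i -= 1
    right = 0
    j = last + 1
    while j < len(mutant) and mutant[j] == donor[j]:
        right += 1
        j += 1
    return left, right


# Hypothetical toy sequences (illustration only).
parent = "GATTACAGGCTTAACCGGTA"
donor = "CATTACTGGATTCACCGGTT"
mutant = "GATTACTGGATTCACCGGTA"

sites = conversion_candidates(parent, mutant, donor)
flanks = flanking_identity(mutant, donor, sites[0], sites[-1])
```

A cluster of such positions bounded by stretches of exact identity
with the donor is the pattern taken as suggestive of gene conversion;
the sketch does not by itself distinguish conversion from independent
point mutations.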

There is tremendous difficulty in distinguishing
between gene conversion and unequal crossing over as
mechanisms of the genetic exchanges in the MHC. The
discovery of gene conversion in fungi was only possible
because the products of a single meiosis in some species
remain in a tightly clustered tetrad in which Mendelian
ratios are directly detectable (Baltimore 1981; Radding
1978).  A  change  in  gene  number  might  be  expected  in 
unequal  crossover,  but  if  the  crossover  event  took  place 
totally  within  the  genes,  then  one  might  find  an 
insertion  or  deletion  of  genetic  sequence  as  the  only 
evidence,  which  is  something  our  laboratory  is  looking 
for  in  the  intron  between  the  first  and  second  protein 
coding  domains.  Steinmetz  et  al.  (1982)  have  even 
postulated  that  unequal  crossover  may  occur  using 
pseudogenes  as  a  genetic  reservoir  for  polymorphic 
sequence  material.  Possible  evidence  for  gene  conversion 
at  the  DNA  sequence  level  is  the  strong  homology  seen 
in  the  flanking  regions  of  suspected  conversion  events; 
perhaps  such  sequences  have  been  selected  for  indirectly 
within  introns  as  shuttle  elements  to  continually 
generate polymorphism on an evolutionary timescale. Still,
there are now three known Aβ sequences, as well as a very
large number of alleles, for generation of diversity in
Aβ, and there is no rule that requires one mechanism to
operate  for  all  class  II  genes  or  that  requires  only  one 
mechanism  to  generate  that  diversity.  More  nucleotide 
sequence  information,  especially  in  the  introns  of  the 
class  II  genes,  should  do  much  to  elucidate  the 
mechanisms involved.

Mechanisms  for  the  generation  of  polymorphism  should 
take  into  account  the  variable  and  conserved  tracts 
within  the  I  region  characterized  by  Steinmetz  et  al. 
(1984). Single copy probes were isolated from the class
II  region  of  a  BALB/c  library  and  were  used  to  screen  DNA 
cosmid  libraries  of  AKR  and  B10.WR7,  haplotypes  H-2k  and 
H-2wr7  respectively.  The  isolated  clones  were  aligned  to 
provide  a  nearly  continuous  stretch  of  DNA  through  the  I 
region  of  the  three  haplotypes,  which  was  restriction 
endonuclease  (RE)  mapped  and  oriented.  Using  probes 
spanning the I region in a Southern blot analysis, a
variable  tract  was  found  in  the  left  half  of  the  I 
region,  and  a  conserved  tract  in  the  right  half,  with 
the dividing point being in the middle of the Eβ gene,
probably overlapping the hot spot for recombination in
the middle of the Eβ gene. The Aβ, Aα, and Eβ genes,
which show extensive polymorphism, are located in the
variable tract, whereas the much less polymorphic Eα gene
is located in the conserved tract. Noncoding sequences
located  in  the  variable  tract  were  found  to  be  just  as 
polymorphic,  or  often  more  so,  than  the  coding  regions 
in  the  variable  tract.  Again,  only  more  nucleotide 
sequence  information  is  likely  to  elucidate  the 
mechanisms  operating  to  generate  and  maintain  the 
polymorphism  in  the  I  region. 

The  hotspots  for  recombination  are  of  special 
interest  in  the  generation  of  polymorphism  in  the  I 
region.  The  recombination  rates  may  even  be  strain 
dependent. Shiroishi et al. (1982) examined a congenic
mouse strain, B10.MOL-SGR, which has an H-2wm7 haplotype
bred onto a C57BL/10 background. This H-2 haplotype, of
Mus  musculus  molossinus  origin,  tremendously  enhanced 
recombination  rates  between  the  K  and  A  loci.  A  similar 
dramatic  increase  in  specific  recombination  rates  has 
been  reported  in  another  wild  mouse  haplotype  (Steinmetz 
et  al.  1986).  Two  haplotypes  from  Mus  musculus  castaneus 
(CAS3  and  CAS4)  showed  recombination  at  the  same  high 
frequency, 0.6%-1.5%, as was seen in Mus musculus
molossinus  derived  MHC  genes. 

Steinmetz et al. (1986) went on to sequence the
intron between the first and second protein coding domains
of the Eβ gene, which probably contains the hotspot
region, and found that the sequence contained a CAGG
tetramer repeated in tandem 22 times, if a mismatch of
one nucleotide is allowed. The sequence has some homology
to the lambda Chi sequence, which promotes recombination,
but the homology is not very strong. A much stronger
degree of homology was found to the core sequence of the
hypervariable minisatellite regions found in human DNA
(Jeffreys et al. 1985). These regions could generate
allelic variability by facilitating unequal crossover
events  during  meiosis,  or  perhaps  even  by  initiating 
a  gene  conversion  event. 
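
Counting a tandem repeat "if a mismatch of one nucleotide is
allowed" can be made concrete with a short sketch. It assumes that
each consecutive four-nucleotide unit may differ from CAGG at no more
than one position; the function name and the toy sequence are
hypothetical, and this is not the published intron sequence or the
authors' own counting procedure.

```python
def count_tandem_units(seq, motif="CAGG", max_mismatch=1):
    """Find the longest run of back-to-back motif-length units.

    Each unit may differ from `motif` at up to `max_mismatch` positions.
    Returns (start_index, number_of_units); start_index is -1 if no unit
    qualifies.
    """
    k = len(motif)
    best_start, best_units = -1, 0
    for start in range(len(seq) - k + 1):
        units = 0
        pos = start
        while pos + k <= len(seq):
            unit = seq[pos:pos + k]
            mismatches = sum(a != b for a, b in zip(unit, motif))
            if mismatches > max_mismatch:
                break
            units += 1
            pos += k
        if units > best_units:
            best_start, best_units = start, units
    return best_start, best_units


# Hypothetical toy sequence: TTA + CAGG CAGG CTGG CAGG CAGA CAGG + TTTT
toy_seq = "TTACAGGCAGGCTGGCAGGCAGACAGGTTTT"
relaxed = count_tandem_units(toy_seq)                  # one mismatch allowed
strict = count_tandem_units(toy_seq, max_mismatch=0)   # exact copies only
```

In the toy sequence, allowing one mismatch per unit joins six units
into a single array, whereas exact matching breaks the array at the
first variant unit, which illustrates why the mismatch allowance
affects the reported repeat count.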

Control  of  expression  of  the  class  II  genes  may  also 
play  a  role  in  their  generation  of  diversity,  either  by 
differential  control  of  expression,  or  by  polymorphism  in 
the  control  elements  themselves.  Some  standard  laboratory 
inbred mouse strains carry mutations that cause failure
of expression of the class II E molecule on the cell
surface (Jones et al. 1978, 1981). These mutations can
be of any one of three types (Hyldig-Nielsen et al. 1983;
Mathis et al. 1983): the H-2b and H-2s haplotypes have a
deletion in the Eα gene, the H-2f haplotype makes an Eα
mRNA of aberrant size, and the H-2q haplotype has a
defect in RNA processing or RNA stability. The lack of a
cell  surface  expressed  E  molecule  for  any  reason  is 
referred  to  as  an  E°  mutation.  The  E°  mutations  have  been 
identified  in  over  50%  of  the  t  bearing  strains  (Nizetic 
et  al.  1984).  Eighteen  t  haplotype  carrying  strains  have 
been found to be E° by Dembic et al. (1984). Three
strains, CR0437, tw2, and t°, were found to transcribe
E but not to make a functional protein. All fifteen other
E°  strains  had  a  deletion  encompassing  the  promoter 
region,  the  RNA  initiation  site,  and  the  first  exon, 
which  amounts  to  an  approximately  650  bp  deletion.  The 
role these mutations might play in the polymorphism of
class II genes is only now beginning to be studied.

In  the  human  system,  there  are  cell  lines  which  have 
specifically  lost  expression  of  all  class  II  molecules 
(Levine et al. 1985). The cell line 6.1.6 is a variant
of  a  normal  lymphoblastoid  line  which  has  been  shown  to 
have  a  regulatory  defect  in  class  II  gene  expression 
(Gladstone  and  Pious  1978,  1980;  Levine  and  Pious  1984). 
P30 is a partial revertant of the 6.1.6 cell line. Levine
et al. (1985) used Southern and Northern blotting of
these two cell lines to show evidence that class II and
Ii (I invariant) chain expression may be linked. The
characterization of the regulatory elements of class II
genes, and of their polymorphic nature, is just beginning.

Variation  in  Wild  Mice 

The  major  purpose  of  this  dissertation  is  to 
address the question of how the
polymorphism of the MHC class II genes arose and how
this polymorphism is maintained. To address this
question realistically requires an understanding of the
evolutionary  relationships  of  the  model  system  being 
studied.  The  more  thorough  the  understanding  of  the 
strain  development  of  the  system,  the  more  informative 
the  study. 

A  major  limitation  of  many  previous  studies  of  the 
extensive  genetic  polymorphism  of  the  murine  class  II 
genes  is  that  only  a  limited  number  of  class  II  alleles 
have  been  studied,  and  nearly  all  of  these  come  from  the 
standard  laboratory  inbred  strains  of  mice.  These  strains 
were  derived  from  a  limited  number  of  sources  with  a  high 
degree  of  interbreeding  early  in  their  development. 
As  such,  they  represent  a  highly  biased  sampling  of 
the mouse population and an artificial collection of
considerable genetic homogeneity (Ferris et al. 1982;
Klein 1974). Wild mouse populations have a relatively
high  degree  of  genetic  variation,  particularly  at  the 
H-2  complex,  when  compared  to  the  standard  laboratory 
inbred  strains  of  mice.  It  is  this  variability,  generated 
and  maintained  through  natural  selection,  which  makes 
wild  mice  a  near  ideal  model  for  the  study  of  the 
genetics  of  class  II  polymorphism.  In  turn,  the  class 
II  polymorphism  is  a  near  ideal  model  to  study  the 
evolution  of  a  species. 

A  useful  definition  of  wild  mice  is  a  population 
whose reproduction is not controlled directly by humans
(Bruell 1970). This study will examine the polymorphic
nature  of  the  class  II  genes  at  the  DNA  level  in  mice  of 
wild  mouse  populations  of  different  subspecies  and 
geographic  origins  as  well  as  the  standard  laboratory 
inbred  strains.  For  this  reason  it  is  of  major  importance 
to  understand  wild  mice  as  a  genetic  model. 

Natural  History  of  Wild  Mice 

Basic  to  the  understanding  of  the  evolutionary 
implications  of  the  selection  process  on  wild  murine 
class  II  gene  products  is  a  rudimentary  understanding  of 
the  natural  history  of  wild  mice.  The  degree  of 
association  of  the  wild  mice  with  humans  can  be  used 
to  distinguish  three  groups  (Sage  1981).  Aboriginal  mice 
live  predominantly  unassociated  with  human  dwellings  or 
food  sources.  Commensal  mice  live  in  close  association 
with human buildings and food supplies, while feral mice
have  made  a  transition  from  the  commensal  stage  back  to 
an  aboriginal  existence.  Much  of  what  is  known  about  the 
natural  history  of  wild  mice  has  been  learned  from 
studies  on  commensal  mice. 

The  term  house  mouse  is  defined  here  because  it 
describes  essentially  all  wild  mice  used  in  this 
dissertation  research.  House  mouse  literally  refers  to 
the  commensal  relationship  between  human  dwellings  and 
certain  species  of  mice.  The  number  of  species  comprising 
what  we  call  the  house  mouse  varies  depending  on  the 
person  defining  the  term.  In  this  dissertation,  the  house 
mouse  shall  be  split  into  seven  species  and  subspecies 
as per Marshall (1981). These include the
commensal mice Mus musculus domesticus, Mus musculus
musculus, Mus musculus castaneus, and Mus musculus
molossinus, and the closely related aboriginal mice
Mus hortulanus, Mus spretus, and Mus abbotti. From fossil
evidence,  nuclear  genetic  variation,  and  mitochondrial 
genetic  variation,  it  has  been  estimated  that  the 
commensal  association  between  humans  and  mice  began  more 
than a million years ago (Ferris et al. 1983).

The  native  distribution  of  the  wild  mouse  species 
ranges  across  Europe,  North  Africa,  and  northern  Asia. 
M.m.  domesticus  and  M.m.  castaneus,  two  commensal 
species,  have  followed  man  into  North  and  South  America, 
Australia,  and  southeastern  Africa,  presumably  as 
stowaways on sailing ships. Thus, most of the standard
laboratory mice of M.m. domesticus origin were
already introduced in the New World. The three related
aboriginal species, M. hortulanus, M. spretus, and
M. abbotti, have a native distribution in Europe and
Asia Minor. The mound-building species, M. hortulanus,
is restricted to the steppe grassland regions of the
Carpathian  basin  and  the  Ukraine  (Petrov  1979). 
M.  spretus  is  found  in  the  warmer  parts  of  the  western 
Mediterranean  regions  from  France  to  Libya,  and  M. 
abbotti is found in southeastern Europe and Asia Minor,
although  its  geographic  distribution  is  less  well 
characterized  (Sage  1981).  Distribution  of  these  three 
aboriginal  species  is  consistent  with  patterns  of  other 
animal  and  plant  groups,  suggesting  that  their  present 
locations  were  determined  by  natural  factors,  not  humans, 
as  opposed  to  M.  domesticus. 

The  western  European  house  mouse,  M.  domesticus,  has 
the  most  diverse  geographic  distribution  of  the  house 
mouse  species  and  has  provided  the  most  information  about 
the  range  of  genetic  variability  of  the  house  mouse 
species.  This  species,  due  to  its  occupation  of 
buildings  and  sailing  ships  during  an  era  of  worldwide 
colonization,  established  founding  populations  in  areas 
as  diverse  as  the  Americas,  Australia,  and  varied 
temperate and tropical Pacific island chains. M.m.
domesticus may be a more advanced member of the genus
based on its great adaptability and spectacular variation
in  color  which  matches  its  various  geographic 
environments  (Marshall  1981).  Many  studies  have  been 
carried  out  where  mice,  usually  M.m.  domesticus,  have 
been  introduced,  a  fact  which  should  be  kept  in  mind 
when  reviewing  the  older  studies  (Schwarz  and  Schwarz 
1943). This problem is potentiated by mice which go feral
after colonizing a new land, thus subjecting themselves to
new  and  different  natural  selective  pressures. 

The native distribution of the house mouse species
has  not  been  thoroughly  documented,  but  some  informative 
observations  have  been  made  (Sage  1981).  M.  spretus,  for 
instance,  is  native  to  western  Europe  and  North  Africa, 
but  has  been  found  in  agricultural  fields,  often 
cornfields,  in  Spain  and  France,  and  grasslands  in  North 
Africa.  This  species  has  been  found  inside  buildings  in 
at least one instance (Sage 1981). M. hortulanus, a
well-studied  aboriginal  species  (Mikes  1971),  has  been 
found  in  grain  fields  and  some  native  steppe  grasslands. 
Whether  or  not  it  inhabits  buildings  is  still 
questionable.  Information  on  the  natural  habitats  of 
M.  abbotti  is  sparse  (Osborn  1965).  They  have  been 
reported  in  agricultural  habitats  in  southern  Georgia, 
U.S.S.R.,  in  grain  fields  and  bamboo  groves  in  Turkey, 
and  adjacent  to  cornfields  in  southern  Yugoslavia. 

The  more  commensal  of  the  house  mouse  species  are 
most  often  found  associated  with  human  buildings,  but  not 
always. M. castaneus is found indoors in Malaya (Harrison
1955), India (Srivastva and Wattal 1973), Indonesia
(Hadi et al. 1976), and Nepal and Thailand (Marshall
1977).  In  fact,  it  has  not  been  reported  outdoors  in 
these  areas  and  may  be  the  most  commensal  of  the  house 
mouse  species.  M.m.  molossinus  has  been  found  in  houses, 
farms,  cultivated  fields,  and  even  along  river  levees  in 
Japan,  as  well  as  abandoned  agricultural  fields  in  Korea 
(Hamajima  1962;  Jones  and  Johnson  1965),  suggesting  that 
it  is  less  of  an  obligate  commensal  than  M.m.  castaneus. 
The  native  range  of  M.m.  musculus  includes  central  Asia 
to  northeastern  Europe.  Its  microhabitats  vary  from 
inside  buildings  and  haystacks  in  much  of  northern  Russia 
and  central  Europe  (Pelikan  1974;  Romanova  1970;  Zejda 
1975)  to  agricultural  fields  and  meadows  in  Denmark 
(Ursin  1952).  In  Sweden  this  species  has  been  reported 
in  natural  wild  locations  independent  of  any  human 
influence  whatsoever  (Zimmerman  1949). 

M.m. domesticus is presently found throughout the
world,  although  it  is  an  adventive  species  in  most  of 
these  areas.  Its  native  range  extends  from  Nepal  to  North 
Africa  and  western  Europe.  Within  its  native  range  it 
can be found in habitats as diverse as agricultural
fields and barren stony ravines isolated from human
settlements,  especially  in  Afghanistan  and  Pakistan 
(Gaisler  1975;  Hassinger  1973;  Roberts  1977).  In  the 
desert of the south Arabian peninsula it has been
found living in burrows of sand rats (Harrison 1972).
The  versatile  adaptability  of  this  species  is 
demonstrated  by  unusual  commensal  habitats  it  has 
occupied  such  as  coal  mines  (Philip  1938)  and  frozen 
meat  lockers  (Mohr  and  Dunker  1930).  M.m.  domesticus  is 
at least as versatile in non-native lands, and has been
found in environments ranging from salt marshes (Breakey 1963)
and grasslands (Pearson 1963) to the Andes mountains
(Harland 1958), although it is predominantly a commensal
species.  It  is  worth  noting  that  it  has  not  been  reported 
in  woodland  forests  in  Europe,  nor  in  the  Americas, 
although  it  has  been  found  to  occupy  the  native  silver 
beech  forests  in  New  Zealand  (Taylor  1978). 

Interspecies  competitive  interactions  are  difficult 
to  study  in  rodents  in  their  native  habitats.  The  most 
thoroughly  studied  case  remains  one  involving  two 
species,  M.m.  domesticus  and  the  vole  Microtus 
californicus  in  the  California  grassland  ecosystems 
(Lidicker  1966).  A  population  of  approximately  12,000 
mice  on  an  island  was  extinguished  within  one  year  after 
the  introduction  of  a  small  number  of  voles.  DeLong 
(1966;  1967)  studied  two  enclosed  populations  of  mice, 
one in the presence of voles and one without. The
population of mice in the enclosure with the voles had
a significantly lower survival rate for postnatal,
preweaning  mice.  Lidicker  (1966)  also  found  that  the 
voles dominated house mice in 94% of their encounters.
DeLong and Lidicker's studies are actually some of the
few  experimental  approaches  in  this  area.  These  rodent 
interactions  demonstrate  part  of  the  natural  selection 
process  when  a  new  species  enters  a  territory.  How  these 
interactions  affect  the  evolution  of  a  native  species 
over  hundreds  of  thousands  of  years  remains  to  be 
determined. 

Variation  of  Non-MHC  Features  in  Wild  Mice 

Factors  affecting  the  evolution  of  the  murine  MHC 
class  II  molecules  are  probably  numerous.  The  wild 
mice  are  an  excellent  system  to  study  evolution  of 
morphological  features,  protein  structure,  and  DNA 
structure.  The  morphological  features  of  the  wild  mice 
have  been  especially  instrumental  in  organizing  the 
phylogenetic  relationships  of  the  different  species  of 
Mus; anatomical features such as dental structure
(Bader  1965;  Van  Valen  1965),  skull  shape  (Hussain  et  al. 
1976),  and  relative  tail  length  and  foot  size  (Ursin 
1952)  have  all  contributed  to  the  classification  scheme. 
Relative  tail  length  has  also  been  a  useful  feature  in 
the  classification  of  wild  mice,  particularly  long  tail 
length  of  M.m.  domesticus,  because  of  a  genetic  region 
known  as  the  t  complex,  which  complicates  tail  length 
inheritance. 

Color  variation  as  a  morphological  feature  was 
critical in establishing the mouse as an excellent model
in the twentieth century to study inherited traits.
Geographic  factors  and  microhabitat  have  played  major 
roles  in  determining  resulting  coat  color  for  the  species 
in  their  native  ranges.  There  is  a  notable  polymorphism 
of  coat  color,  particularly  in  ventral  coloration, 
detectable  in  some  species  of  mice  such  as  M.m. 
molossinus  (Hamajima  1964)  and  M.m.  musculus  (Serafinski 
1965).  The  coat  color  patterns  are  important  for 
geneticists because of their great utility as genetic
markers,  but  also  because  a  multifactorial  nature  has 
been  shown  to  be  involved  (Falconer  1947).  Genes  on  five 
or  more  chromosomes  have  been  found  controlling  melanism 
(Radbruch  1973).  Just  as  the  coat  color  genes  are 
important  markers  for  geneticists,  coloration  affecting 
natural  selection  and  survival  will  influence  the 
polymorphism  of  some  of  the  biochemical  factors. 

Variation  in  the  proteins  in  wild  mice  has  been 
assayed most commonly with electrophoresis and serology,
as reviewed by Sage (1981). Many variant forms of proteins
have  been  localized  to  a  particular  chromosomal  position 
(Womack  1979),  but  the  function  of  these  proteins  has  not 
always  been  identified.  Protein  variation  in  wild  mouse 
populations  has  also  proved  useful  in  learning  about  the 
heterozygosity  levels  in  M.  domesticus  populations  around 
the world (Berry and Peters 1977; Rice and O'Brian 1980;
Sage  1978).  Such  studies  have  aided  in  the  classification 
of the different subspecies of wild mice (Bonhomme et al.
1978; Minezawa et al. 1979). One example of how protein
variants  have  led  to  important  discoveries  in  evolution 
and  ecology  is  the  discovery  of  the  hybrid  zone  in  Europe 
by Selander (Selander et al. 1969; Hunt and Selander
1973).  A  zone  of  contact  between  M.m.  domesticus  and 
M.m.  musculus  runs  across  the  Jutland  peninsula  in 
Denmark  and  continues  through  the  eastern  part  of  West 
Germany.  The  hybrid  zone  is  as  narrow  as  20  kilometers 
in  some  places,  and  has  possibly  been  in  existence  for 
5000  years.  Free  interbreeding  occurs  between  the 
"semispecies"  within  the  zone,  but  not  on  either  side 
of  it.  M.m.  domesticus  alleles  have  been  detected  within 
the  M.m.  musculus  populations  within  the  hybrid  zone,  but 
not  vice  versa,  perhaps  reflecting  social  dominance  of 
M.m.  domesticus  over  M.m.  musculus  (Thuesen  1977).  While 
selection  operates  most  often  at  the  protein  level,  this 
dissertation  will  examine  the  DNA  coding  for  variation 
in  a  specific  group  of  proteins. 

Variability  in  chromosome  structure,  once  thought 
nonexistent,  has  been  discovered  to  be  quite  prevalent 
in  certain  regions  of  the  world,  e.g.  Italy  and 
Switzerland (Gropp et al. 1969; 1970; 1972). The
previously  "normal"  chromosomal  complement  was  thought 
to  be  20  pairs  of  acrocentric  chromosomes.  In  an 
excellent  review  article  by  the  discoverer  of 
Robertsonian  translocations  (Gropp  and  Winking  1981), 
Gropp and Winking describe the presence in wild mouse
populations of metacentric chromosomes formed by the
joining  of  two  acrocentric  chromosomes. 

Studies  on  the  variability  of  mitochondrial  DNA 
sequence  in  various  species  of  house  mice  using  an  RFLP 
analysis  were  first  reported  by  Yonekawa  et  al.  (1980). 
The  25  standard  laboratory  strains  they  analyzed  showed 
no  variation  and  were  identical  with  a  sample  of  wild 
M.m.  domesticus  from  Canada,  but  the  patterns  from 
M.m.  castaneus  and  M.m.  molossinus  were  very  different. 
Variability of mitochondrial DNA within M.m. molossinus
populations appears to be limited.

An  extensive  analysis  of  mitochondrial  DNA  evolution 
in 208 mice by Ferris et al. (1983) has reinforced the
phylogenetic  classification  scheme  of  Marshall  and  Sage 
(1981)  for  the  seven  Mus  species  and  subspecies  of  house 
mice  discussed  here.  An  RFLP  analysis  of  the 
mitochondrial  DNA  of  four  commensal  and  three  aboriginal 
species  of  house  mice  and  the  standard  laboratory  mice 
led  to  the  construction  of  evolutionary  trees  on  the 
basis  of  mitochondrial  polymorphisms.  These  evolutionary 
trees  emphasized  the  distinctiveness  of  M.m.  domesticus 
from  the  other  commensal  species  of  mice.  All  50  of  the 
standard  laboratory  mouse  strains  analyzed  were  found 
to  be  M.m.  domesticus.  The  mitochondrial  evolutionary 
tree  also  reinforces  that  the  three  European  aboriginal 
species of mice which have been discussed, M. spretus,
M. abbotti, and M. hortulanus, differ substantially from
the commensal mouse species and are each an individual
species  of  Mus. 

The  first  commensal  mice  may  have  begun  their 
relationship  with  humans  one  to  two  million  years  ago, 
assuming  the  rate  of  mutational  divergence  in 
mitochondrial  DNA  is  between  2%  and  4%  per  million  years 
(Ferris  et  al.  1983).  Mitochondrial  DNA  comparisons 
between  mammals  whose  divergence  times  have  been 
estimated  from  fossil  records  (Brown  et  al.  1979; 
Brown  et  al.  1982;  Ferris  et  al.  1981;  Upholt  and  Dawid 
1977)  have  provided  this  estimate  of  mitochondrial  DNA 
divergence.  This  estimate  of  commensalism  between  mice 
and  humans  fits  with  Sage's  (1981)  protein  comparisons 
and  may  correlate  with  Mus  species  divergence. 
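
The arithmetic behind this date can be sketched as follows. The sketch assumes a simple linear molecular clock, and the 4% pairwise divergence used as input is a hypothetical value chosen only because it reproduces the one to two million year range quoted above; it is not a figure taken from Ferris et al. (1983). The function name and parameters are likewise illustrative.

```python
def divergence_time_range(percent_divergence, rate_low=2.0, rate_high=4.0):
    """Return (youngest, oldest) divergence time in millions of years.

    percent_divergence: pairwise sequence divergence, in percent.
    rate_low, rate_high: assumed divergence rates, in percent per
    million years (2% and 4% are the bounds quoted in the text).
    """
    if percent_divergence < 0 or rate_low <= 0 or rate_high < rate_low:
        raise ValueError("need divergence >= 0 and 0 < rate_low <= rate_high")
    youngest = percent_divergence / rate_high  # fast clock, younger date
    oldest = percent_divergence / rate_low     # slow clock, older date
    return youngest, oldest


# Hypothetical input: 4% divergence (illustrative assumption, not source data).
youngest, oldest = divergence_time_range(4.0)
print(f"{youngest:.1f} to {oldest:.1f} million years")
```

Because the date scales inversely with the assumed rate, the twofold uncertainty in the rate carries over directly into a twofold uncertainty in the estimated time.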

The  "t"  Complex  in  Wild  Mice 

The  term  t  complex  indicates  the  part  of  the 
chromosome  which  is  occupied  by  a  complete  t  haplotype. 
Occurring  in  a  frequency  of  10%  (Artzt  et  al.  1985) 
to  40%  in  most  of  the  sampled  wild  mouse  populations 
(Dembic et al. 1984), t haplotypes are structurally
variant  forms  of  a  segment  of  murine  chromosome  17. 
Mouse  t  haplotypes  are  thoroughly  reviewed  in  a  recent 
article  by  Silver  (1985).  When  first  discovered 
(Dobrovolskaia-Zavadskaia and Kobozieff 1932), and for
many  years  thereafter,  t  haplotypes  were  thought  to  be 
recessive alleles at the Brachyury (T) locus. Although
there is a T locus near the centromere on chromosome 17,
it is a well-defined single locus which is only a small
part  of  a  t  haplotype  (Bennett  et  al.  1975). 

The  different  t  haplotypes  all  appear  to  be  related 
to  one  another  structurally.  A  complete  t  haplotype 
encompasses about 30 x 10^3 kb, which accounts for
approximately  1%  of  the  entire  mouse  genome  and  includes 
the  entire  murine  H-2  complex,  hence  the  connection  with 
the  polymorphism  of  the  class  II  genes.  There  are  also 
polymorphisms  within  the  t  haplotypes  themselves,  among 
the  most  t  specific  of  which  may  be  the  t  complex 
proteins (TCP) (Silver et al. 1979; Silver et al. 1983).
Also  within  the  chromosomal  region  occupied  by  t 
haplotypes  are  many  other  normal  genes  common  to  non-t 
bearing  mice,  along  with  a  smaller  number  of  mutant  t 
genes  which  must  effect  the  t  specific  characteristics. 

The  t  haplotypes  have  been  known  to  influence  tail 
length,  fertility,  embryogenesis,  male  transmission 
ratio,  and  meiotic  recombination  (Dunn  and  Gluecksohn- 
Schoenheimer 1950; Silver 1985). Of these characteristics,
suppression of recombination is believed to have
maintained the t haplotypes as a distinct genomic unit.
Furthermore, a distorted
male-specific  transmission  ratio  permits  propagation 
through  mouse  populations  despite  the  deleterious  effects 
which accompany complete t haplotypes.

The suppression of recombination which occurs in
complete  t  haplotypes  with  non-t  wild  haplotypes,  as 
first  discovered  by  Dunn  and  Caspari  (1945),  is  related 
to  the  t  genomic  structure.  This  suppression  extends 
from T, includes the H-2 complex (Hammerling and Klein
1975), and the Tla and Qa-2 regions (Shin et al. 1982;
Silver 1981), but not the Pgk-2 locus (Nadeau 1983;
Rudolph and Vanderberg 1981). Thus, a complete t
haplotype consists of a 12 to 15 cM region of the
chromosome  with  concomitant  suppression  of  recombination 
from  somewhere  between  the  centromere  and  T  and  extending 
to  somewhere  between  the  distal  part  of  the  MHC  and 
Pgk-2.  Rare  chromosomes  which  had  recombined  within  the  t 
haplotype  were  discovered  and  designated  as  partial  t 
haplotypes. These rare recombinants were subsequently
found  to  be  of  critical  importance  in  understanding  the 
physical  structure  of  the  t  haplotypes  (Lyon  1960; 
Lyon  and  Meredith  1959). 

With  partial  t  haplotypes  as  a  tool  used  to  infer 
structure,  the  region  of  suppression  of  recombination  was 
found  to  occur  only  along  the  extent  of  t  DNA  present 
(Bechtol and Lyon 1978; Bennett et al. 1979). Normal
recombination  rates  between  t  haplotypes  also  suggested 
that  the  structures  of  t  haplotypes  were  similar  to  one 
another,  and  different  from  the  same  chromosomal  region 
in wild type DNA (Artzt et al. 1982a; Condamine et al.
1983). Artzt et al. (1982b) were able to demonstrate that
the physical locations of the H-2 and the tf locus were
reversed in t haplotypes relative to their location in
the wild type chromosome. These results were confirmed by
others (Shin et al. 1983b; Shin et al. 1984). A complete
t  haplotype  therefore  consists  of  a  distal  inversion, 
which  includes  tf  and  H-2,  a  proximal  inversion  which 
includes  T  and  the  genes  encoding  the  Tcp  (T  complex 
proteins)  products  (Herrmann  et  al.  1986),  and  possibly 
a  small  central  inversion. 

Many  complete  t  haplotypes  are  known  to  have  lethal 
effects  in  homozygous  t  embryos  (Klein  et  al.  1984). 
This  can  be  a  useful  tool  as  one  of  the  few  ways  to 
distinguish  complete  t  haplotypes  from  one  another,  as 
different  chromosomes  carrying  different  lethal  mutations 
can  complement  each  other  in  genetic  tests  (Bennett  1975; 
Klein et al. 1984; Winking and Guenet 1978). It also
became  possible  to  analyze  the  genetic  basis  for  t  lethal 
effects  with  the  finding  that  normal  crossover  occurs 
between  two  different  t  haplotypes  (Silver  and  Artzt 
1981).  The  majority  of  the  complete  t  lethal  mutations 
analyzed  appear  to  be  single-locus  mutations,  and  lethal 
mutations  of  complementing  t  haplotypes  are  not  allelic 
to each other (Artzt et al. 1982a). Although there is
some  evidence  for  clustering  (Artzt  1984),  the  different 
lethal  mutations  appear  to  be  distributed  over  the  entire 
length  of  complete  t  haplotypes.  Overall,  the  entire 
genetic basis for the t lethal mutations seems to be
straightforward, but the molecular mechanism by which
they  effect  their  lethality  is  still  unknown. 

The  male-specific  transmission  ratio  distortion 
(TRD)  inherent  with  t  haplotypes  is  responsible  in  part 
for  propagation  of  t  haplotypes  through  the  wild  mouse 
population,  even  though  the  t  haplotypes  carry 
deleterious  genes.  Wild  males  with  a  complete  t  haplotype 
will  transmit  it  to  well  over  90%  of  their  offspring 
(Lyon  and  Meredith  1964a,  1964b).  Mice  carrying  a  single 
partial  t  haplotype  cannot  transmit  it  at  a  high  ratio, 
but  the  TRD  can  be  restored  in  males  carrying  particular 
pairs  of  partial  t  haplotypes  in  cis  or  trans 
configuration  (Silver  1985).  This  effect  was  higher  with 
certain  trans  combinations  for  a  portion  of  the  t 
haplotypes,  leading  Lyon  (1984)  to  propose  a  model  in 
which  partial  t  haplotypes  carry  different  lengths  of  t 
DNA  with  particular  sets  of  distortion  loci.  Lyon  (1984) 
hypothesized  that  a  series  of  t-specific  distorter  loci, 
Tcd, act on a single t-specific responder locus, Tcr. The
effects of the Tcd loci would be additive, and they could
act cis or trans to the Tcr locus to transmit it at a
high ratio when enough Tcd loci are present. Evidence in
support  of  this  model  has  been  obtained  by  Fox  et  al. 
(1985).  Further  research  based  on  this  model  should  be 
forthcoming. 

Sterility  is  another  effect  which  accompanies  the 
presence of any two complete t haplotypes in male mice.
The physiological reasons for this sterility are still
unknown.  The  sperm  appear  to  be  morphologically  normal 
(Hillman  and  Nadijcka  1980).  This  sterile  condition  is 
of  particular  interest  because  of  its  strong  similarities 
to  the  t's  TRD  effect.  The  possibility  exists  that 
the  two  are  related,  but  this  has  not  yet  been  proven. 

The  association  of  the  t  haplotypes  with  the  murine 
histocompatibility  class  II  molecules  has  long  fascinated 
geneticists,  due  to  the  inclusion  of  the  H-2  complex  in 
the recombination suppression of complete t haplotypes.
The  extreme  polymorphic  nature  of  the  class  II  molecules 
has  provided  excellent  markers  and  an  approach  to  study 
evolution  of  the  t  haplotypes.  Dembic  et  al.  (1984)  and 
Nizetic  et  al.  (1984)  have  drawn  correlations  between  an 
Eα deletion and its association with t haplotypes which
suggest  an  ancient  origin  for  this  deletion.  Association 
of  the  members  of  the  same  t  complementary  group  with 
the  same  H-2  haplotype  supports  this  view,  but 
interpretations  should  be  made  cautiously  as  the  evidence 
that  H-2  haplotype  association  with  t  chromosomes  is 
derived  from  a  single  ancestor  is  not  conclusive.  More 
recently, Figueroa et al. (1985) have revealed the
existence  of  three  major  groups  of  class  II  alleles 
associated  with  particular  t  haplotypes.  These  results 
are  not  in  conflict  with  those  to  be  presented  in  this 
dissertation.

Polymorphism of the H-2 Class II Genes in Wild Mice

The  murine  class  II  histocompatibility  genes  are 
one  of  the  most  polymorphic  gene  complexes  in  mammalian 
genetics.  Research  on  the  genetic  basis  for  this 
polymorphism  and  its  functional  significance  has  led 
to  many  critical  discoveries  in  transplantation  biology, 
cancer  research,  and  genetics.  However,  most  of  the 
research  in  these  areas  has  been  carried  out  using 
standard  laboratory  mice.  As  was  mentioned  earlier, 
almost  all  of  these  mice  are  of  M.m.  domesticus  origin 
and  derived  from  a  very  limited  number  of  stocks  which 
were  not  well  characterized. 

The  presence  in  wild  mouse  populations  of  private 
H-2  antigen  specificities  absent  from  the  standard 
laboratory  inbred  mice  led  to  the  realization  that  there 
was  a  need  to  identify  and  characterize  these  new 
H-2  specificities.  The  methodology  of  choice  was  a 
serological  characterization  of  the  wild  H-2  haplotypes, 
but  the  problem  was  to  isolate  these  antigens  from 
non-H-2  antigens  so  that  antisera  specific  for  only  the 
H-2  could  be  produced.  Klein  developed  the  B10.W  congenic 
lines  (Klein  1973,  1975),  where  "W"  stands  for  wild.  The 
wild males were bred with a B10.BR female and the progeny
were  backcrossed  8  to  14  times  to  the  same  inbred  strain 
with  a  continual  selection  for  an  H-2  marker  (Ssh) 
specific  for  the  wild  mouse's  H-2  haplotype.  The  Ssh 
animals were then intercrossed and progeny with Ssh and
an H-2.23 negative phenotype (wild haplotype) were
selected  to  establish  homozygous  lines  with  brother  x 
sister  matings  to  maintain  the  line.  Thus,  each  wild 
H-2  haplotype  is  bred  onto  a  C57BL/10  background  for 
specific  analysis  of  the  wild  type  H-2. 

Once  the  B10.W  lines  were  established,  a  serological 
examination  of  sixteen  of  their  wild  H-2  haplotypes 
substantiated  their  extreme  polymorphism  (Klein  1975; 
Zaleska-Rutczynska  and  Klein  1977).  A  few  wild  haplotypes 
appeared  to  be  identical  to  one  another  serologically, 
and  a  few  of  them  resembled  standard  laboratory  strains 
of  mice,  but  most  were  different  from  one  another  and 
different  from  all  known  laboratory  inbred  mouse  strains 
(Zaleska-Rutczynska 1977). A serological analysis of 29
wild-derived H-2 haplotypes (Wakeland and Klein 1979a;
Wakeland and Klein 1981) defined five new I region
antigens, with the inclusion of three new haplotypes,
u, v, and j, on their inbred panel. Mentioned in this
same report are the beginnings of discernible
"phenogroups." Also, wild mouse haplotypes which
showed evidence of possible recombination in the H-2
complex  were  characterized  (Duncan  and  Klein  1980; 
Wakeland  and  Klein  1979b),  suggesting  that  the  wild 
mouse  haplotypes  may  be  of  use  in  analyzing  recombination 
mechanisms  as  they  occur  in  a  natural  population. 

The  combination  of  serological  and  tryptic  peptide 
mapping analyses proved to be very informative for
Wakeland and Klein (1983). They were able to organize
29  B10.W  lines  into  8  distinct  antigenic  groups.  The 
tryptic  peptide  mapping  correlated  with  the  serology, 
but  also  demonstrated  an  extremely  high  degree  of 
similarity  of  the  class  II  molecules  of  members  of  the 
same  family.  These  class  II  families  often  had  a  standard 
laboratory  inbred  mouse  strain  as  a  "prototypic"  member. 
The  discovery  of  the  existence  of  groupings  of  class  II 
wild  haplotypes  bears  directly  on  the  question  of  how 
the  polymorphism  of  the  class  II  molecules  arose  and  how 
it  is  maintained.  The  process  of  generation  of  diversity 
in  class  II  molecules  may  not  be  as  random  as  once 
thought.  These  aspects  are  stressed  here  because  these 
groupings  formed  the  basis  for  this  dissertation. 

Two  of  the  groupings  established  by  Wakeland  and 
Klein (1983), the A^k and A^p families, were subsequently
selected for more detailed analysis. The tryptic peptide
fingerprints of the Aα, Aβ, Eα, and Eβ subunits encoded
by four of the wild H-2 genes in the A^k group were
compared. The Aα and Aβ subunits of all of the related
haplotypes differed from Aα^k and Aβ^k by less than 10% of
their tryptic peptides (Wakeland and Darby 1983). The
tryptic peptide fingerprint comparisons of the Eβ gene
in these same strains were Eβ^d-like in two wild
haplotypes and Eβ^s-like in another wild haplotype
suggesting that recombination between the A and Eβ genes may be
significant in the wild. This may reflect different
evolutionary patterns of the Aβ and Aα genes with respect
to the Eβ genes.

The A^k and A^p families (Wakeland and Darby 1983;
Wakeland  and  Klein  1983)  were  also  analyzed  to  determine 
the  effect  of  their  minor  structural  variations  on 
allorecognition  by  T  lymphocytes  (Peck  et  al.  1983). 
Minor  structural  variations  in  the  A  molecule  were 
usually found to cause major functional changes in
allorecognition. These changes were always detected
when the Aβ subunit contained the structural variation.
Peck  et  al.  (1983)  also  found  that  more  than  one  site  in 
the  A  molecule  can  be  recognized  by  alloreactive  T 
lymphocytes.  These  results  suggest  that  specific  sites 
in  the  A  molecule  are  critical  for  allorecognition.  Thus 
it  would  be  informative  to  know  the  location  of  the  minor 
structural  differences  between  wild  H-2  haplotypes  in 
either the A^k or the A^p family to determine if the
differences  are  in  a  critical  binding  area  of  the 
molecule.  If  so,  the  evolutionary  mechanism  generating 
the  polymorphism  found  in  one  of  these  haplotype  families 
would  seem  to  be  operating  in  a  non-random  process. 

Radiochemical  sequence  analysis  of  tryptic  peptides 
of wild-derived H-2 complexes of A^k family members has
localized structural variations of the A molecule to the
α1 and β1 domains (Wakeland et al. 1985). The variations
have been localized in the Aα molecule to two adjacent
peptides. In the Aβ subunit the differences have been
localized to single amino acid changes, possibly due to
single point mutations in the encoding DNA. Thus, the A^k
family of class II alleles is probably diversifying by
the accumulation of discrete mutations within the exons
encoding the α1 and β1 domains. Again this suggests that
wild-derived variants in exon structure are not random.
Recent data on the DNA structure of the A^k and A^p
families based on RFLP analysis (McConnell et al. 1986)
suggest that the intron structure may also be
informative in determining the evolutionary lineage of H-2
class II haplotypes.


MATERIALS  AND  METHODS 


Mice 


All  mice  were  from  the  mouse  colony  in  the  Tumor 
Biology  Unit  at  the  Department  of  Pathology,  University 
of  Florida,  or  from  our  wild  mouse  colony  at  the  Animal 
Care  Facility,  University  of  Florida.  Strains  used 
included AZROU 1, AZROU 2, BELGRADE 1, C57BL/10, B10.BR,
B10.BUA16, B10.CAA2, B10.CAS2, B10.CHA2, B10.D2, B10.F,
B10.KEA5, B10.M, B10.PL, B10.Q, B10.RIII(71NS), B10.S,
B10.SAA48, B10.SM, B10.STC77, B10.STC90, B10.WB, tw71,
TT6, t6-JRl, t"8, tw75, t^ ^5, ^2, JERUSALEM 3,
JERUSALEM 4, METKOVIC 1, STU, VIBORG 5, and W12A. The
inbred  mouse  strains  are  maintained  by  full  brother  x 
sister  mating  with  a  single  line  of  descent.  All  mouse 
strains  are  homozygous  at  the  H-2  complex  unless 
otherwise  noted. 

Isolation  of  Genomic  DNA 

Genomic  DNA  was  prepared  from  liver  tissue  according 
to the methods of Maniatis et al. (1982). The mice were
deprived of food for 24 hours prior to sacrifice. The
livers were minced with surgical scissors and placed in a
mortar which contained liquid nitrogen. The frozen tissue
pieces were then ground to a fine powder and added to 40
ml of TES buffer (10 mM Tris HCl, pH 7.5; 5 mM EDTA; 100
mM NaCl) with 1% SDS and 0.4 mg/ml of proteinase K (Sigma,
St.  Louis,  MO).  This  DNA  preparation  was  then  incubated 
at 65°C for 16 hours, extracted three times with Tris
equilibrated phenol (pH 7.5), twice with chloroform and
isoamyl alcohol (96:4 v/v) and then precipitated by the
addition  of  2.5  volumes  of  isopropanol.  The  high 
molecular  weight  genomic  DNA  was  hooked  from  the 
isopropanol  solution  with  a  drawn  Pasteur  pipette, 
dissolved  in  0.5  ml  TE  (10  mM  Tris  HCl,  pH  7.5,  1  mM 
EDTA),  and  dialyzed  extensively  against  TE  buffer.  The 
resulting genomic DNA preparations were then analyzed for
purity  and  quantitated  by  spectrophotometry  and  agarose 
gel  electrophoresis.  All  DNA  preparations  used  in  this 
study had 260/280 ratios in excess of 1.8 and migrated
as high molecular weight DNA on 0.7% agarose gels.


Restriction  Endonuclease  Digestions 
and  Agarose  Gel  Electrophoresis 


A Tris buffered solution containing 15 µg of genomic
DNA was digested with 30 units of enzyme for 16 hours at
37°C under conditions described by the supplier (Bethesda
Research Laboratories, Bethesda, MD). An additional 15
units of enzyme was then added for 8 hours. The
efficiency of each endonuclease digestion was monitored
by  removing  10%  of  the  digest  reaction  volume  immediately 
following  the  final  addition  of  endonuclease  and  adding 
this aliquot to 0.5 µg of lambda phage DNA. Following an
8  hour  incubation,  digestion  of  the  genomic  DNA  was 
analyzed  by  electrophoresis  in  0.7%  agarose  gels. 
Complete  digestion  of  the  genomic  and  lambda  phage  DNA 
was  detected  as  a  "smear"  of  genomic  DNA-derived 
restriction  fragments  together  with  a  pattern  of  lambda 
DNA  derived  restriction  fragments  characteristic  of 
complete  digestion  with  each  specific  enzyme.  The  bulk 
of the digested genomic DNA (13.5 µg) was stored at -20°C
until  electrophoresis.  Digested  genomic  DNAs  were 
electrophoresed  through  0.7%  agarose  gels  for  40  hours 
at  1.5  V/cm  or  for  20  hours  at  3.0  V/cm  in  a  high 
resolution  horizontal  electrophoresis  apparatus  with 
cooling  thermoplate  (International  Biotechnologies 
Incorporated, New Haven, CT).
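
The quantities in this section can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and treating field strength multiplied by run time as a measure of equivalent running conditions is a simplifying assumption rather than a statement from the protocol.

```python
def volt_hours_per_cm(field_v_per_cm, hours):
    """Field strength multiplied by run time, in V*h/cm."""
    return field_v_per_cm * hours


def dna_after_aliquot(total_ug, aliquot_fraction):
    """Micrograms of DNA left after removing a fraction of the digest."""
    if not 0.0 <= aliquot_fraction <= 1.0:
        raise ValueError("aliquot_fraction must lie between 0 and 1")
    return total_ug * (1.0 - aliquot_fraction)


slow_run = volt_hours_per_cm(1.5, 40)      # 40 hours at 1.5 V/cm
fast_run = volt_hours_per_cm(3.0, 20)      # 20 hours at 3.0 V/cm
remaining = dna_after_aliquot(15.0, 0.10)  # 10% removed as the lambda control
units_per_ug = (30 + 15) / 15.0            # total enzyme added per ug of DNA

print(f"slow run: {slow_run:.0f} V*h/cm, fast run: {fast_run:.0f} V*h/cm")
print(f"DNA remaining for electrophoresis: {remaining:.1f} ug")
print(f"total enzyme: {units_per_ug:.0f} units per ug")
```

Both running conditions give the same product of field strength and time, and removing the 10% control aliquot from 15 µg accounts for the 13.5 µg stated in the text.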

Capillary  Transfer  and  Hybridization. 

Following  electrophoresis,  DNA  was  transferred  from 
the  gel  to  nylon  filters  (Zetabind,  AMF,  Meriden,  CT)  by 
the  method  of  Southern  (1980).  Transfer  efficiency  was 
monitored  by  comparing  the  amount  of  DNA  remaining  in  the 
gel  following  transfer  with  the  amount  present  prior  to 
transfer by ethidium bromide staining and photographic
analysis. The nylon filters were vacuum dried for 2
hours at 80°C and stored on desiccant at 4°C until
hybridization. The filters were hybridized with a
32P-labeled 5.8 kb Eco RI fragment containing the entire
Aβ^d gene (Malissen et al. 1983) or with a 1.2 kb Hind III
fragment containing part of the Aα gene derived from
I-A^b (J. Seidman, personal communication 1984). The
probes  were  radiolabeled  with  32P-dCTP  to  a  specific 
activity  of  >2  x  108  dpm/ug  by  nick  translation  (Bethesda 
Research  Laboratories,  Bethesda,  MD) .  Hybridization  and 
rehybridization  conditions  were  as  described  by  the 
supplier  of  the  Zetabind  nylon  filters  (AMF,  Meriden, 
CT).  Final  stringency  was  established  by  two  30  minute 
washes  at  65 °C  with  0.015  M  NaCl,  0.0015  M  sodium 
citrate,  0.1%  SDS.  Autoradiographs  were  produced  by 
exposure  for  2-6  days  on  XAR-5  X-ray  film  (Kodak, 
Rochester,  NY)  with  intensifying  screens  (Dupont, 
Wilmington,  DE)  at  -70 °C. 
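The final wash is given as molar concentrations rather than as a multiple of SSC. As a small sketch, assuming the conventional definition of 1x SSC (0.15 M NaCl, 0.015 M sodium citrate), the wash can be converted to an SSC strength as follows; the function name is hypothetical.

```python
# Conventional 1x SSC composition (assumed reference values).
SSC_1X_NACL_M = 0.15
SSC_1X_CITRATE_M = 0.015


def ssc_strength(nacl_molar, citrate_molar):
    """Return the SSC multiple implied by the two salt concentrations."""
    nacl_ratio = nacl_molar / SSC_1X_NACL_M
    citrate_ratio = citrate_molar / SSC_1X_CITRATE_M
    # Both salts must be diluted by the same factor to be a true SSC dilution.
    if abs(nacl_ratio - citrate_ratio) > 1e-9:
        raise ValueError("salt concentrations are not a simple SSC dilution")
    return nacl_ratio


wash = ssc_strength(0.015, 0.0015)
print(f"final wash = {wash:.1f}x SSC, 0.1% SDS, 65 C")
```

On this reading the final wash corresponds to 0.1x SSC, a high stringency condition.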

Data  Analysis. 

RFLP analyses were performed using equation 21 from
Nei and Li (1979), F = 2nXY/(nX + nY), in which nX and
nY are the numbers of fragments in populations X and Y,
respectively, and nXY is the number of fragments shared
by the two populations. The validity of the formula was
tested by Nei and Li in known pairwise sequence
comparisons. An F value was calculated for each pairwise
comparison using the fragments from all restriction
digests. Restriction fragments which hybridized weakly
with the probes for either Aα or Aβ were not included in
the analysis.
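The F statistic above can be written as a short function. This is a minimal sketch, not the procedure actually used for the analysis: it assumes each allele is represented as a set of (enzyme, fragment size in kb) pairs and that two fragments count as shared only when both the enzyme and the recorded size match exactly. The function name and example patterns are hypothetical.

```python
def fraction_shared(frags_x, frags_y):
    """F = 2 * n_XY / (n_X + n_Y) for two sets of restriction fragments."""
    n_x, n_y = len(frags_x), len(frags_y)
    if n_x + n_y == 0:
        raise ValueError("at least one fragment is required")
    n_xy = len(frags_x & frags_y)   # fragments present in both alleles
    return 2 * n_xy / (n_x + n_y)


# Hypothetical single-enzyme patterns, for illustration only.
allele_x = {("Sac I", 5.2), ("Sac I", 3.8), ("Sac I", 2.65)}
allele_y = {("Sac I", 7.85), ("Sac I", 3.8)}

f_xy = fraction_shared(allele_x, allele_y)
print(f"F = {f_xy:.2f}")   # one shared fragment out of five scored
```

In practice fragment sizes read from a gel carry measurement error, so a real comparison would need a size tolerance rather than exact matching.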


RESULTS 


RFLP Analysis of the Aα and Aβ Genes
of Standard Laboratory Inbred and Wild Mice


Table 1 presents the 37 mouse strains analyzed in
this study, including 13 standard laboratory inbred
strains, 15 strains containing wild derived H-2
haplotypes, and 9 t haplotype bearing strains. The 28
wild and standard laboratory inbred mouse haplotypes
are dealt with in this first section of the results.
As seen in table 1, the mice analyzed represent three
different subspecies: Mus musculus domesticus, Mus
musculus musculus, and Mus musculus castaneus. The
genomic structures of the Aα and Aβ alleles of these
haplotypes were compared by RFLP analysis with DNA
probes specific for these genes, as described in the
materials and methods. A more detailed diagram of the
Aβ probe is illustrated in figure 2 in the literature
review. The probe consists of a 5.4 kilobase (kb) Eco RI
genomic fragment derived from the H-2b haplotype.

Table 1. Mouse strains used in this study.

Mouse                                          Geographic     Aβ
strain         Subspecies           H-2        origin         group

C57BL/10       M.m.a domesticus     b          Old inbred     b
AZROU 1        "                    w201       Morocco        b
JERUSALEM 3    "                    w203       Israel         b
tw75           "                                              b
JERUSALEM 4    "                    w204       Israel         b
B10.M          "                    f          Old inbred     b
B10.WB         "                    j          Old inbred     b
B10.S          "                    s          Old inbred     b
B10.STC90      "                    w15        Michigan       b
tw71           "                                              b
tw12           "                                              b
TT6-865        "                                              b
TT6-866        "                                              b
AZROU 2        "                    w217       Morocco        b
STU            "                    w34        Eur. inbred    b
W12A           "                    w216       Netherlands    b
B10.D2         M.m. domesticus      d          Old inbred     d
B10.RIII       "                    r          Old inbred     d
METKOVIC 1     "                    w205       Yugoslavia     d
B10.BUA16      "                    w22        Michigan       d
B10.CAS2       M.m. castaneus       w17        Thailand       d
t6-JR1         M.m. domesticus                                d
tw5            "                                              d
B10.SM         "                    v          Old inbred     d
B10.F          "                    p          Old inbred     d
B10.Q          "                    q          Old inbred     d
B10.KEA5       "                    w5         Michigan       d
B10.CAA2       "                    w11        Michigan       d
B10.STC77      "                    w14        Michigan       d
BELGRADE 1     M.m. musculus        w202       Yugoslavia     d
tw8            M.m. domesticus                                d
tw32           "                                              d
B10.SAA48      "                    w3         Michigan       d
B10.BR         M.m. domesticus      k          Old inbred     k
B10.CHA2       "                    w26        Old inbred     k
B10.PL         "                    u          Old inbred     k
NZW            "                               Old inbred     k

a The abbreviation M.m. represents the species designation
Mus musculus.

In the initial RFLP analysis of the Aβ gene of the
28 standard laboratory inbred and wild mouse strains, I
obtained restriction fragments such as those illustrated
in  figure  4.  Patterns  of  similarity  between  different 
Aβ alleles began to stand out. The Aβd and Aβw22 alleles are
seen to have identical Pvu II fragments of 2.89 kb and
1.85  kb.  Apb,  Apf,  and  h$i    alleles  have  identical  4.83 
kb and 2.92 kb Pvu II restriction fragments (RFs); the extra
band in the Aβb allele has been proven to be a plasmid
contaminant in this particular Southern blot. The Aβp, Aβr,
Aβv, and Aβw201 alleles all have Pvu II RFs of 2.89 kb and
2.75 kb.

As the RFLP patterns became more obvious, I arranged
the mouse strains according to their similarities and
carried out more Southern blots. Figure 5 is a
representative autoradiogram of the same strains of mice,
minus one, that were analyzed in the Pvu II autoradiogram
of figure 4, reorganized to emphasize the groupings. The
Aβ alleles of r, q, d, v, and w22 all have Sac I
restriction fragments (RFs) of 5.2, 3.8, and 2.65 kb. The
one evolutionary group d member on this autoradiogram
that lacks the 5.2 and 2.65 kb RFs, although it does have
the 3.8 kb RF, is BELGRADE 1 (Aβw201). When the two
missing RFs of 2.65 and 5.2 kb from the d group alleles
are added together, the sum is 7.95 kb, 0.1 kb different
from the new 7.85 kb RF of the BELGRADE 1 allele and well
within experimental error. This suggests that the new RF
is due to the absence in BELGRADE 1 of a single Sac I
site that is present in the Aβ r, q, d, v, and w22
alleles.


Figure 4. Autoradiogram of a Southern blot of DNA from
the different mouse strains, which was digested with
the restriction endonuclease Pvu II, electrophoresed,
blotted, and probed with the Aβ genomic probe as
described. Each band represents the relative location
of a restriction fragment in the gel, which differs
according to the positions in the gene of the recognition
sites of the restriction endonuclease used, Pvu II in
this case. Molecular weight markers are indicated at the
right side of the figure.

Figure 5. Autoradiogram of a Sac I restriction
endonuclease digestion of standard laboratory inbred
and wild mouse strains' DNA probed with Aβ. This panel
shows representative members of the three evolutionary
Aβ groups discovered.

Of the group b Aβ alleles on this autoradiogram,
the f, j, b, and s alleles have a 1.58 kb Sac I RF in
common. The f, j, and b alleles have a common 7.3 kb
Sac I RF, while the f, j, and s alleles have a common
3.8 kb Sac I RF. The k group members shown on this
autoradiogram have a common 4.6 kb Aβ Sac I RF which has
been found only in k group members and not in any of the
other 33 mouse strains examined.

Digestion of the mouse strains' DNA with a total of
seven restriction endonucleases (REs) makes possible a
detailed and accurate comparison between alleles to
better define the groups. Consequently, the DNA sequence
homology among these Aβ alleles can be quantitatively
estimated from the RFLP data by calculating the fraction
homologous (F) value as defined by Nei and Li (1979). The
F value is the fraction of RFs which the two alleles have
in common. An F value of 1.00 indicates that all RFs for
all seven REs are identical for the two alleles being
compared. An F value of 0 indicates that no RFs are
shared between the two alleles being compared. A number
of mouse strains, which would always be in the same
group, had identical RFLPs for all seven restriction
enzymes, giving them F values of 1.00 when compared to
one another. B10, AZROU 1, JERUSALEM 3, and tw75 all have
identical Aβ alleles, thereby establishing them as a core
of the b group, named after the b haplotype of the
prototypic C57BL/10 mouse strain (also designated B10).
The tw71 and tw12 strains have identical Aβ alleles, as
do the two TT6 t haplotype strains, although these two
may carry identical chromosomes. The two mouse strains
B10.WB and JERUSALEM 4 are another pair with an F value
of 1.00 between them, with B10.M differing from them by a
single Bgl II fragment. In the d group, named after the
well characterized prototypic d haplotype, the Aβ alleles
of B10.RIII and METKOVIC 1 are identical to one another,
as are those of tw8 and tw32. The mouse strains B10.F,
B10.Q, B10.CAA2, B10.KEA5, and B10.STC77 all possess
identical Aβ alleles, as do the pair STU and W12A, and
the k group members B10.BR and B10.CHA2, which will all
be considered in detail shortly. RFLP analysis with seven
different restriction endonucleases indicated that 22
different Aβ alleles were present among the 36 different
mouse strains analyzed, including t haplotype bearing
mice.
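Counting distinct alleles amounts to collecting strains whose fragment patterns are identical across all enzymes (F = 1.00) into one class. The following is a minimal sketch of that bookkeeping, assuming each strain's pattern is stored as a set of (enzyme, size in kb) pairs; the strain names and patterns are placeholders, not data from this study.

```python
from collections import defaultdict


def identity_classes(patterns):
    """Group strains whose complete fragment patterns are identical."""
    classes = defaultdict(list)
    for strain, fragments in patterns.items():
        # frozenset makes the pattern hashable so it can serve as a key
        classes[frozenset(fragments)].append(strain)
    return sorted(sorted(members) for members in classes.values())


# Hypothetical patterns, for illustration only.
toy_patterns = {
    "strain_A": {("Sac I", 7.3), ("Sac I", 1.58)},
    "strain_B": {("Sac I", 7.3), ("Sac I", 1.58)},
    "strain_C": {("Sac I", 5.2), ("Sac I", 3.8)},
}

classes = identity_classes(toy_patterns)
n_alleles = len(classes)
print(n_alleles, "distinct alleles:", classes)
```

Each class corresponds to one allele as defined by RFLP, so the number of classes is the number of distinct alleles detected.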

Table 2 presents a matrix comparison of the
divergence of Aβ within and between representative
members of groups b, d, and k expressed as F values, and
based on results obtained with seven REs. The full
strain designations can be found in Table 1. Once the F
values for the Aβ alleles of the 37 standard laboratory
inbred strains, wild derived strains, and t haplotype
bearing strains were calculated and compared, the
existence of three groups became obvious. The F values
within any one of the three groups are usually greater
than 0.50, and most often 0.67 or higher. The discrete
nature of the groups is demonstrated by the F values
between the groups, which range from zero to 0.18,
indicating very little homology.
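The separation between within-group and between-group F values suggests a simple way to recover such groups computationally. The sketch below is an illustration only and is not the procedure used in this study: it links any two alleles whose F value is at least 0.50 and reports the connected sets (single-linkage grouping). The threshold, function names, and toy fragment sets are all assumptions.

```python
from itertools import combinations


def f_value(x, y):
    """Nei and Li F for two non-empty fragment sets."""
    return 2 * len(x & y) / (len(x) + len(y))


def group_alleles(patterns, threshold=0.50):
    """Single-linkage grouping: link alleles whose F is >= threshold."""
    parent = {name: name for name in patterns}

    def find(name):
        # Follow parent links to the representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(patterns, 2):
        if f_value(patterns[a], patterns[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in patterns:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical fragment identifiers, for illustration only.
toy = {
    "allele_1": {1, 2, 3, 4},
    "allele_2": {1, 2, 3, 5},
    "allele_3": {6, 7, 8, 9},
    "allele_4": {1, 6, 7, 8},
    "allele_5": {10, 11, 12},
}

groups = group_alleles(toy)
print(groups)
```

Because within-group F values are reported as usually above 0.50 and between-group values as at most 0.18, any threshold between those bounds would give the same grouping for data of that kind.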

Table 1 contains the complete listing of the 37
strains analyzed, with the Aβ group to which each has
been assigned shown in the far right column. Every mouse
strain for which RFLP data were obtained using seven REs
can be placed into one of these three groups. Also
presented in table 1 is the subspecies of each of the
strains tested. There is no correlation between the
subspecies of the mice and the Aβ group to which they
belong. For example, the subspecies Mus musculus
domesticus is present in all three groups, and three
different subspecies are present in the d group,
indicating that the existence of these Aβ groups predated
the subspeciation of Mus musculus, at least into the
three subspecies represented here. Subspeciation probably
occurred approximately one million years ago. The ancient
origin of these evolutionary Aβ groups indicates a
continual maintenance of at least part of the Aβ gene
structure.

To further substantiate the existence and the
discrete nature of the evolutionary Aβ groups, a
statistical comparison of the Aβ groups is shown in
table 3. The mean F value between any two members
within the same group is no lower than .641, while the
mean F value between members of different groups is no
higher than .147. The Student's t test value for each of
the three groups corresponds to a probability of
p < 0.001, indicating the statistical validity of these
three evolutionary groupings.

Table 3. Statistical comparison of Aβ groups
defined by RFLP analysis.

              Mean F value ± S.D.

         Within         Between                       Student's
Group    same group     different groups    d.f.a     t test

1        .641 ± .157    .098 ± .074         466       50.13
2        .644 ± .161    .103 ± .088         511       48.94
3        .697 ± .158    .147 ± .091         140       13.99

a Degrees of freedom.

The genetic diversity of Aα in 35 of the 36 strains
of mice was analyzed by RFLP using a 1.2 kb Hind III
restriction fragment derived from Aαb (J. Seidman,
personal communication). This fragment contains part of
the  exon  which  encodes  the  a^_  domain  of  ^  together  with 
approximately one kb of flanking intron sequences
(McIndoe and Wakeland, unpublished observation).
Representative examples of the restriction fragment
patterns detected at high stringency with this Aα probe
are shown in figure 6. Digestion with the restriction
endonuclease Eco RI resulted in a 10.6 kb Aα fragment in
all of the 35 mouse strains examined, without exception.
Upon digestion with the other six restriction
endonucleases used in this study, however, a substantial
number of restriction fragment polymorphisms were noted.
Of the 35 mouse strains assessed for Aα, 27 separate
alleles are discernible by the RFLP analysis, indicating
the polymorphic nature of the Aα gene.

Table 4 presents a matrix comparison of the
calculated F values based on the restriction fragments
obtained with the total of seven restriction
endonucleases. The mouse strains in this Aα matrix are
the same strains listed in table 2 and represent the
three evolutionary Aβ groups. Within each of the
three Aβ groups, the Aα gene shows lower F values than
Aβ, indicating a lesser degree of homology between
members of the same Aβ group. More significantly, the F
values of the Aα alleles when comparing between the
different Aβ groups are much higher than for the Aβ
alleles, ranging very close to the F values seen within
an Aβ group. Thus no groups are detected in this RFLP
analysis of the Aα gene of 35 different strains of mice.


Figure 6. Autoradiogram of an Eco RI restriction
endonuclease digestion of wild and standard laboratory
inbred strains of mice, probed with a 1.2 kb Hind III
fragment from Aα kindly supplied by Dr. John Seidman.
As with the Aβ probe, this Aα probe is derived from a
genomic clone and contains predominantly noncoding
sequence.

RFLP Analysis of t Haplotype-Bearing Mice

A number of t haplotype bearing mice have also been
examined in this protocol because there exists some
controversy regarding their evolutionary origin. To
address this question, nine different t haplotype
strains were examined by RFLP analysis in a
collaborative study with Dr. Joseph Nadeau of Jackson
Laboratories. Eight different t haplotype bearing mice,
plus two k haplotype controls, were examined with the
same seven restriction endonuclease RFLP analysis used
for the other 28 mouse strains.

Figure 7 shows a representative autoradiogram of an
Aβ probed Sac I restriction endonuclease digestion of
the nine t haplotype bearing mouse strains as listed in
the figure and in table 1. The t haplotype strains are
often maintained as heterozygotes because some t
haplotypes are lethal when homozygous. This may explain
the multitude of restriction fragments seen in this
figure. To compensate for this, the k restriction
fragments (7.85, 4.6, and 2.0 kb) have been subtracted
from t haplotypes tw71, tw75, tw12, tw5, and tw32. In
addition, the TT6 strains, which may have identical
chromosomes, are heterozygous with t6-JR1, so those
restriction fragments have been subtracted from TT6 for
each restriction endonuclease in order to calculate
relevant F values. The 7.3 kb Aβ Sac I restriction
fragment is present in strains tw71, TT6, tw75, and tw12.
The 5.2 kb Aβ Sac I fragment is found in strains TT6, t6,
tw8, tw5, and tw32. The 3.6 kb fragment is present in all
eight of the t haplotype strains except for tw75, which
is identical to C57BL/10 for all seven restriction
endonucleases. Finally, the 1.58 kb Sac I fragment as
shown in figure 7 is present in strains tw71, TT6, tw75,
and tw12, all of which are in the b group.


Figure 7. Autoradiogram of a Sac I restriction
endonuclease digestion of t haplotype bearing mouse
strains, probed with the Aβ probe. Many of the strains
are heterozygous with either the k haplotype or the t6
haplotype, as described in the text, explaining the
overabundance of Aβ restriction fragments seen.
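The subtraction of partner-haplotype fragments from a heterozygous lane can be sketched as a filtering step. The example below is hypothetical: the observed lane is invented for illustration, the 0.05 kb matching tolerance is an assumption, and only the k fragment sizes (7.85, 4.6, and 2.0 kb) are taken from the text.

```python
def subtract_partner(observed_kb, partner_kb, tolerance_kb=0.05):
    """Remove fragments attributable to the known partner haplotype."""
    return [
        size for size in observed_kb
        if not any(abs(size - p) <= tolerance_kb for p in partner_kb)
    ]


# Sac I fragments of the k haplotype, as given in the text.
k_fragments = [7.85, 4.6, 2.0]

# Hypothetical heterozygous lane, for illustration only.
observed_lane = [7.85, 7.3, 4.6, 3.6, 2.0, 1.58]

t_fragments = subtract_partner(observed_lane, k_fragments)
print(t_fragments)
```

One limitation of this approach is that a fragment genuinely shared by the t haplotype and its partner would also be removed, which would lower the calculated F values.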

Because the t haplotype strains had various Aβ
restriction fragments in common with members of the
three Aβ evolutionary groups, the RFLPs of the t
haplotype strains were compared to those of the other 28
mouse strains analyzed. The Aβ alleles of all nine of the
t haplotype strains examined by RFLP analysis fit into
either the b or the d group, as listed in table 1 and
shown with representative strains in table 5. The F
values for their Aβ alleles range from 0.50 to 1.00
within a group, and from 0 to 0.17 between groups in
table 5. Again, as with some of the other wild and
standard laboratory inbred strains, a few are identical
to one another, such as tw12 and tw71, and tw8 and tw32.
Not shown in table 5 are the F values comparing the t
haplotype strains in the b and d groups to members of
the k group. The F values in this comparison ranged from
0 to 0.17; therefore none of the t haplotype strains
examined are members of the k group.

Table 6 presents a statistical comparison of the
Aβ alleles of the t haplotype strains to determine
whether, within either of the Aβ evolutionary groups,
they are more related to one another than they are to
any other member of their respective Aβ group. As was
just mentioned, there is no question that each belongs
to one of the two groups. The Student's t test values
shown in table 6 indicate that in neither group are the
Aβ alleles of the t haplotypes more related to one
another than to other members of the same Aβ group at
the p < 0.001 level of significance, particularly in the
d Aβ evolutionary group.
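The t statistics and degrees of freedom reported in table 6 (2.85 with 63 d.f. for the b group, 2.11 with 60 d.f. for the d group) can be converted to probabilities to check the significance statement. The sketch below assumes a two-tailed test and integrates the Student's t density numerically with Simpson's rule; the function names are hypothetical.

```python
import math


def t_density(x, df):
    """Probability density of Student's t distribution at x."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p value from Simpson's rule on the interval [0, |t|]."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps   # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = 2 * total * h / 3   # probability that |T| <= t
    return max(0.0, 1.0 - central_area)


p_b = two_tailed_p(2.85, 63)   # b group values from table 6
p_d = two_tailed_p(2.11, 60)   # d group values from table 6
print(f"b group: p = {p_b:.4f}; d group: p = {p_d:.4f}")
```

Under the two-tailed assumption neither value reaches p < 0.001, and the d group value is the larger of the two, in line with the statement above.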

The Aα alleles of the t haplotype strains were also
examined by RFLP analysis with the same seven restriction
endonucleases. Figure 8 shows representative Bgl II
restriction fragments seen in the t haplotype strains
with the Aα probe which has already been described.
Although Aα is a polymorphic gene, no grouping patterns
were detected by RFLP analysis, as is readily
demonstrated in figure 8.

Table 6. Statistical comparison of Aβ alleles of
t haplotype mice within related t haplotype groupings
and between t haplotype groupings compared to
their related evolutionary Aβ grouping.

              Mean F value ± S.D.

         Within         Between                       Student's
Group    Aβ t group     t and Aβ groups     d.f.a     t test

b        .764 ± .230    .614 ± .137         63        2.85
d        .733 ± .179    .606 ± .136         60        2.11

a Degrees of freedom.


Figure 8. Autoradiogram of a Bgl II restriction
endonuclease digestion of the t haplotype bearing
strains of mice, probed with the Aα fragment.


RFLP Analysis of the Divergence
of the Aα and Aβ Genes Within the Ap Family


I have compared the organization and structures of
the Aα and Aβ alleles present in the Ap family members
by RFLP analysis with DNA probes specific for each gene.
The Ap family consists of 6 M.m. domesticus H-2
haplotypes and 1 M.m. castaneus H-2 haplotype derived
from Asian and North American wild mouse populations
(Wakeland and Klein 1983). Their grouping of the 7 mouse
strains into the Ap family is based on similarities
in the antigenic phenotypes of the respective class II
molecules. High Pressure Liquid Chromatography (HPLC)
tryptic peptide fingerprint comparisons of the Aα and
Aβ subunits of the Ap family members have demonstrated
close structural relationships of the I-A molecules
encoded by alleles in the Ap family (McConnell et al.
1986). A gradation in the relatedness of the 7 I-A
molecules became apparent from these tryptic peptide
fingerprint data.

Representative examples of the restriction fragment
patterns detected by probing at high stringency with Aβ
are presented in figure 9. Digestion with endonuclease
Eco RI yields 2 different sizes of restriction fragments
for Aβ among the 7 Ap family alleles. Aβw3 is present
on a 6.2 kb fragment while the remainder of the Ap
family alleles are present on 5.8 kb fragments (Figure 9,
left). Similarly, digestion with endonuclease Bam HI
yields two patterns of restriction fragments within the
Ap family (Figure 9, right). Digestion of Aβp, Aβq,
Aβw5, Aβw14, and Aβw11 with Bam HI yields 3.6 kb and 5.4
kb fragments while digestion of Aβw3 and Aβw17 yields a
single 9.0 kb fragment. The results with Bam HI are
consistent with the loss of a single Bam HI site within
the Aβw3 and Aβw17 alleles during their evolutionary
divergence from the rest of the Ap family. The DNA
sequence homology among these Aβ alleles, as before, is
quantitated from the RFLP data by calculating the
fraction homologous (F) value. Table 7 presents a matrix
comparison of the divergence of Aβ within the Ap family
based on results obtained with 7 restriction enzymes. The
Aβp, Aβw11, Aβw5, and Aβw14 alleles are indistinguishable
by RFLP, consistent with the similar structures of the Aβ
subunits they encode. Although the Aβw3 subunit differs
from that of Aβp by only 6 peptides, Aβw3 and Aβp differ
by 60% of their restriction fragments. In contrast,
Aβw17 differs from Aβp by only 25% of the restriction
fragments compared, although the Aβ subunit it encodes
also differs from that of Aβp.


Figure 9. Autoradiogram of an Eco RI and a Bam HI
restriction endonuclease digestion of the seven
members of the I-Ap family, probed with Aβ,
demonstrating the RFLP relatedness of the five
core strains.

As with the Aβ gene, the genetic diversity of the Aα
gene within the Ap family was assessed by RFLP analysis.
A 1.2 kb Hind III genomic restriction fragment derived
from Aα was used to probe DNA from the Ap family
digested with the same 7 restriction endonucleases used
for the Aβ analysis. Representative examples of
restriction fragment patterns detected at high
stringency with this probe for Aα are shown in
Figure 10. Digestion with restriction endonuclease
Eco RI yielded a single 10.6 kb fragment containing Aα
for every allele in the Ap family (Figure 10, left
side). Similarly, digestion with Bam HI yielded a 5.2 kb
Aα fragment with every Ap family allele except Aαw17,
which yielded a 5.4 kb fragment (Figure 10, right side).

Table 7. RFLP analysis of the Ap family Aα alleles


Allele   p        q        w11      w5       w14      w17      w3

p        -        1.00a    .95      .95      .95      .76      1.00
q        22/22b   -        .95      .95      .95      .76      1.00
w11      20/21    20/21    -        1.00     1.00     .80      .95
w5       20/21    20/21    20/20    -        1.00     .80      .95
w14      20/21    20/21    21/21    21/21    -        .80      .95
w17      16/21    16/21    16/20    16/20    16/20    -        .76
w3       22/22    22/22    20/21    20/21    20/21    16/21    -

a The upper right half of each checkerboard lists the
fraction homologous (F) value as defined by Nei and
Li (1979).

b Number of shared restriction fragments/total
restriction fragments scored for both alleles. This
analysis is based on restriction digestions with Bam HI,
Bgl II, Eco RI, Hind III, Pst I, Pvu II, and Sac I.

c The genomic DNA from w17 could not be digested to
completion with Hind III and consequently results with
this enzyme for w17 were not included in this
analysis.

Table 8 presents a matrix comparison of the
calculated F values between the various Aα alleles
present within the Ap family. The evolutionary divergence
of these alleles coincides closely with that detected for
Aβ. The Aαp and Aαq alleles are indistinguishable with
the 7 restriction enzymes used and can be distinguished
from Aαw5, Aαw11, and Aαw14 by a single restriction
fragment, indicating that their DNA sequences are very
homologous. These results are consistent with the
structural similarity of the Aα subunits encoded by the
gene (McConnell et al. 1986), and correlate precisely
with RFLP results obtained with Aβ (see Table 7).
Although by RFLP analysis Aα is not organized into
evolutionary groups as is Aβ, the restriction fragment
genotypes of Aαw3 and Aαw17 were distinguishable from
each other and from the other 5 members of the Ap
family. As with Aβ, the structural relationships of the
Aαw3 and Aαw17 subunits with those of the others of the
Ap family did not correlate precisely with the
evolutionary relationships of their RF genotypes. As
before, Aαw3 was less related than Aαw17 to the rest of
the Ap family, although Aαw3 is more similar to Aαp than
Aαw17 by tryptic peptide fingerprinting (McConnell et al.
1986).


Figure 10. Autoradiogram of an Eco RI and Bam HI
restriction endonuclease digestion of the seven
members of the I-Ap family, probed with Aα. The
10.6 kb Eco RI fragment is present in all the mouse
strains tested.

Table 8. RFLP analysis of the Ap family Aβ alleles


Allele   p        q        w11      w5       w14      w17      w3

p        -        1.00a    1.00     1.00     1.00     .80      .43
q        24/24b   -        1.00     1.00     1.00     .80      .43
w11      24/24    24/24    -        1.00     1.00     .80      .43
w5       24/24    24/24    24/24    -        1.00     .80      .43
w14      24/24    24/24    24/24    24/24    -        .80      .43
w17      16/20    16/20    16/20    16/20    16/20    -        .53
w3       10/23    10/23    10/23    10/23    10/23    10/19    -

a The upper right half of each checkerboard lists the
fraction homologous (F) value as defined by Nei and
Li (1979).

b Number of shared restriction fragments/total
restriction fragments scored for both alleles. This
analysis is based on restriction digestions with Bam HI,
Bgl II, Eco RI, Hind III, Pst I, Pvu II, and Sac I.

c The genomic DNA from w17 could not be digested to
completion with Hind III and consequently results with
this enzyme for w17 were not included in the analysis.


RFLP Analysis of the Divergence
of the Aα and Aβ Genes Within the Ak Family


The Ak family contains 5 M.m. domesticus H-2
haplotypes which were derived from either European or
North American wild mouse populations (Wakeland and Darby
1983). The antigenic phenotypes expressed by alleles in
the Ak family are very similar, but at least 3 minor
variants of the Ak molecule are present (Wakeland and
Darby 1983). Tryptic peptide fingerprinting and
radiochemical sequencing studies have demonstrated that
these 3 forms of the Ak molecule differ by only 2 or 3
amino acid substitutions in the α1 and β1 protein domains
(Wakeland et al. 1985).

Examples of the restriction fragment patterns
detected by probing at high stringency with Aβ are
presented in Figure 11. Digestion with restriction
endonuclease Sac I yields 3 distinct restriction fragment
patterns for Aβ among the 5 Ak family alleles. Aβk and
Aβw26 are detected on 7.85, 4.6, and 2.0 kb fragments,
Aβw216 and Aβw34 are detected on 5.2, 3.8, 2.3, and 1.58
kb fragments, and Aβw15 is detected on 7.3, 3.5, and 1.58
kb fragments. Aβk and Aβw26 are indistinguishable and
share no restriction fragments with Aβw15, Aβw34, and
Aβw216. Similarly, Aβw216 and Aβw34 are indistinguishable
with Sac I and share only a 1.58 kb fragment with Aβw15
(Figure 11, left side). Results obtained following
digestion with restriction endonuclease Eco RI are
similar except that with this enzyme Aβw15, Aβw34, and
Aβw216 are indistinguishable (Figure 11, right side).
Aβk and Aβw26 are detected on a 16 kb fragment while
Aβw15, Aβw34, and Aβw216 are detected on a 6.4 kb
fragment.


Figure 11. Autoradiogram of a Sac I and Eco RI
restriction endonuclease digestion of the I-Ak
family probed with Aβ.

Table  9  presents  a  pairwise  comparison  of  the 
homology  of  Ap  within  the  Ak  family  based  on  RFLP  results 
obtained  with  7  restriction  enzymes.  These  results 
clearly  divide  the  A3  alleles  in  the  Ak  family  into  two 
distinct  groups  based  on  homology  of  their  genomic 
structures.  I  found  that  these  two  A^k  groups  are  members 
of  A3  evolutionary  groups  b  and  d,  and  share  less 
than  15%  of  their  restriction  endonuclease  fragments. 
There  is  also  some  heterogeneity  of  Ap  within  the  Aob 
evolutionary  group  as  A0w34  and  A$w216 ,  which  were 
derived  from  European  wild  mice,  can  be  distinguished 
from  the  North  American  derived  Asw15  allele  by 
variations  in  4  restriction  fragments. 
Table 9.  RFLP analysis of the Ak family Aβ alleles.

Group
Designation  Alleles    k       w26     w216    w15     w34

k            k          -       1.00a   .09     .09     .09
k            w26        22/22b  -       .09     .09     .09
b            w216       2/23    2/23    -       .58     1.00
b            w15        2/23    2/23    14/24   -       .58
b            w34        2/23    2/23    24/24   14/24   -

a  The upper right half of each checkerboard lists the
fraction homologous (F) value as defined by Nei and
Li (1979).

b  Number of shared restriction fragments/total
restriction fragments scored for both alleles. This
analysis is based on restriction digests with Bam HI,
Bgl II, Eco RI, Hind III, Pst I, Pvu II, and Sac I.
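
The F values in the table follow from counts of shared and total
fragments. As a minimal sketch, assuming that fragments of the same
size are scored as shared and taking F = 2 n_xy / (n_x + n_y) as
defined by Nei and Li (1979), the following computes F from the
Sac I fragment sizes reported in the text. The function name and the
size tolerance are illustrative assumptions, not part of the original
analysis, and the values in Table 9 pool seven enzymes, so these
single-enzyme values differ from them.

```python
def fraction_shared(frags_x, frags_y, tol=0.0):
    """Nei and Li (1979) F = 2 * n_xy / (n_x + n_y).

    frags_x, frags_y: fragment sizes in kb for two alleles.
    tol: hypothetical size tolerance (kb) for calling two fragments the same.
    """
    remaining = list(frags_y)
    shared = 0
    for size in frags_x:
        # Match each fragment of x to at most one unmatched fragment of y.
        match = next((s for s in remaining if abs(s - size) <= tol), None)
        if match is not None:
            remaining.remove(match)
            shared += 1
    total = len(frags_x) + len(frags_y)
    return 2 * shared / total if total else 0.0


# Sac I fragment sizes (kb) for the Ak family A-beta alleles, from the text.
SAC_I = {
    "k": [7.85, 4.6, 2.0],
    "w26": [7.85, 4.6, 2.0],
    "w216": [5.2, 3.8, 2.3, 1.58],
    "w34": [5.2, 3.8, 2.3, 1.58],
    "w15": [7.3, 3.5, 1.58],
}

for a, b in [("k", "w26"), ("k", "w15"), ("w216", "w15")]:
    print(a, b, round(fraction_shared(SAC_I[a], SAC_I[b]), 2))
```

The ratios in Table 9 appear to be reported in the same form, with
shared fragments counted once for each allele: 2/23, for example,
corresponds to F = .09.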
Figure 12 presents representative examples of the
restriction fragment patterns detected at high stringency
with the 1.2 kb Hind III fragment probe for Aα. Digestion
with restriction endonuclease Hind III yielded 3 distinct
restriction fragment patterns among the 5 Ak family
alleles. Aαk and Aαw26 are detected on 9.4 kb fragments,
Aαw34 and Aαw216 are detected on >20 kb fragments, and
Aαw15 is detected on a 12.5 kb fragment (Figure 12, right
side). Aαk and Aαw26 are indistinguishable and share no
restriction fragments with Aαw34, Aαw216, or Aαw15; and
Aαw15 is distinguishable from Aαw34 and Aαw216. Digestion
with restriction endonuclease Pvu II yielded 2 distinct
restriction fragment patterns in which Aαk and Aαw26 are
detected on 3.3 kb fragments while Aαw34, Aαw216, and
Aαw15 are detected on 3.8 kb fragments (Figure 12, left
side). The Aαk and Aαw26 alleles are again closely
related to one another and different from the other
alleles, which are all related on the protein level.

A pairwise comparison of the homology of Aα within
the I-Ak family is presented in Table 10. This analysis
clearly divides the Aα alleles in the Ak family into
two groups based on the homology of their gene
structures. These two groups coincide precisely with
the Aβ evolutionary groups to which the Aβ alleles
belong. As with the Aβ gene, some additional genetic
diversification of the Aα gene was detected in the Aα
alleles within each group. Aαk can be distinguished from


Figure 12. Autoradiogram of a Pvu II and a Hind III
restriction endonuclease digestion of the I-Ak family,
probed with Aα.
Table 10.  RFLP analysis of the Ak family Aα alleles.

Group
Designation  Alleles    k       w26     w216    w15     w34

k            k          -       .90a    .63     .38     .63
k            w26        18/20b  -       .74     .48     .74
b            w216       12/19   14/19   -       .50     1.00
b            w15        8/21    10/21   10/20   -       .50
b            w34        12/19   14/19   18/18   10/20   -

a  The upper right half of each checkerboard lists the
fraction homologous (F) value as defined by Nei and
Li (1979).

b  Number of shared restriction fragments/total
restriction fragments scored for both alleles. This
analysis is based on restriction digests with Bam HI,
Bgl II, Eco RI, Hind III, Pst I, Pvu II, and Sac I.
Aαw26 by one restriction fragment and Aαw15 can be
distinguished from Aαw216 and Aαw34 by three restriction
fragments. This pattern of diversification also coincides
with that seen for Aβ, except that Aβw26 and Aβk could
not be distinguished with the 7 restriction endonucleases
used in this analysis.


DISCUSSION 


RFLP  Analysis  of  the  Genomic  Structures 
of  the  Murine  Class  II  Histocompatibility  Alleles 


Comparisons of the RF genotypes of Aβ and Aα
alleles provide information on DNA sequence variations
throughout the interval of genomic DNA containing Aα or
Aβ, including sequences contained in exons, introns, and
flanking regions. The size of the interval of genomic
DNA analyzed with each probe can be estimated by
summing the sizes of all the restriction fragments
detected with each restriction endonuclease and averaging
these sums across enzymes. For the 7 restriction enzymes
used in this analysis, the Aβ probe hybridized to an
average of 9.4 kb of genomic DNA per restriction enzyme
digest and the Aα probe hybridized to an average of
7.2 kb. Therefore, the polymorphic restriction enzyme
sites assayed in this study are distributed over a fairly
large segment of genomic DNA. Because Aα and Aβ are each
encoded by about 700 base pairs of exon DNA, the majority
of the genomic DNA assayed by this RFLP analysis is from
intron and flanking regions. Thus, although RF genotypes
reflect sequence variations in the entire segment of
genomic DNA containing the assayed gene, the majority of
the restriction sites detected reflect DNA sequence
variations in non-coding regions.
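
The interval estimate described above can be sketched as follows.
Only the Sac I and Eco RI fragment sizes for Aβk are given in the
text of the results, so the example averages over those two digests
alone; the 9.4 kb and 7.2 kb figures are averages over all seven
enzymes and are not reproduced here. The function and variable names
are illustrative assumptions.

```python
def mean_interval_kb(digests):
    """Average, across enzymes, of the summed fragment sizes (kb) per digest."""
    sums = [sum(fragments) for fragments in digests.values()]
    return sum(sums) / len(sums)


def coding_fraction(exon_kb, interval_kb):
    """Fraction of the assayed interval that is exon sequence."""
    return exon_kb / interval_kb


# A-beta k fragments (kb) for the two digests reported in the results text.
abeta_k = {"Sac I": [7.85, 4.6, 2.0], "Eco RI": [16.0]}
print(round(mean_interval_kb(abeta_k), 2))

# About 0.7 kb of exon DNA within the reported average intervals.
print(round(coding_fraction(0.7, 9.4), 3))  # A-beta probe, 9.4 kb interval
print(round(coding_fraction(0.7, 7.2), 3))  # A-alpha probe, 7.2 kb interval
```

With the figures given in the text, exon sequence accounts for less
than one tenth of the interval assayed by either probe.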

Standard  Laboratory  Inbred  and  Wild  Mice 

As was stressed in the section on polymorphism in
the literature review, the use of wild H-2 haplotypes
has been critical to this study. With the use of mice of
known subspecies origin and known geographic origin, more
accurate interpretations about the generation of
polymorphism of the Aα and Aβ genes can be made from the
data. The roles of natural selection and evolution in the
generation of polymorphism are essentially unknown in
the standard laboratory mouse strains when compared to
the wild mice. This advantage of wild mice in analyzing
the polymorphism of class II genes, combined with the
discovery of wild-derived H-2 haplotypes and their
characterization into distinct antigenic families
(Wakeland and Klein 1977; Wakeland and Klein 1983), has
laid the groundwork for this dissertation.

The 36 mouse strains listed in Table 1 all fit into
one of the three Aβ groups, as shown in the right hand
column of Table 1. The fact that Mus musculus domesticus
is present in all three evolutionary groups, as well as
the fact that three different subspecies are present in
one of the groups, indicates that these three Aβ groups
were present as discrete groups prior to subspeciation
of Mus musculus. Therefore, there must be some selective
pressures to maintain the genomic structure as detected
by the RFLP analysis, even through the evolutionary
process of subspeciation. As previously mentioned, the
Aα and Aβ probes used in this study are both derived
from genomic clones and contain predominantly noncoding
sequence. Whatever genetic element has maintained these
evolutionary groups, it is probably located in an intron
of the Aβ gene. The striking difference between the
protein and genomic structures of the Ap and the Ak
families provides particularly strong evidence for an
intron element being fundamental to the difference
between evolutionary Aβ groups.

Definition  of  Evolutionary  Groups  by  RFLP  Analysis 

The distinct grouping of the Aβ alleles is
demonstrated by the representative mouse strains and
their values as shown in Table 2. Because of the number
of mouse strains analyzed, they could not all be
electrophoresed and Southern blotted onto the same sheet
of nylon membrane. Each strain was therefore compared
with every other strain on at least three and on as many
as five different autoradiograms of Southern blots. In
this manner every restriction fragment detected was
electrophoresed on the same autoradiogram with any other
restriction fragment that was close to the same size.
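
One way to make the grouping criterion explicit is single-linkage
clustering on F values. The sketch below assumes a cutoff of
F >= 0.50 for membership in the same group (the abstract states that
alleles within a group share 50% or more of their restriction
fragments) and uses the Aβ F values for the Ak family from Table 9.
The function name and the clustering procedure are illustrative
assumptions; they are not the procedure by which the groups were
defined in this study.

```python
def group_alleles(alleles, f_values, cutoff=0.50):
    """Single-linkage grouping: alleles joined by any pair with F >= cutoff."""
    parent = {a: a for a in alleles}

    def find(a):
        # Follow parent links to the representative of a's group.
        while parent[a] != a:
            a = parent[a]
        return a

    for (a, b), f in f_values.items():
        if f >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for a in alleles:
        groups.setdefault(find(a), []).append(a)
    return sorted(sorted(g) for g in groups.values())


ALLELES = ["k", "w26", "w216", "w15", "w34"]
# A-beta F values from Table 9 (upper right half).
F_ABETA = {
    ("k", "w26"): 1.00, ("k", "w216"): 0.09, ("k", "w15"): 0.09,
    ("k", "w34"): 0.09, ("w26", "w216"): 0.09, ("w26", "w15"): 0.09,
    ("w26", "w34"): 0.09, ("w216", "w15"): 0.58, ("w216", "w34"): 1.00,
    ("w15", "w34"): 0.58,
}

print(group_alleles(ALLELES, F_ABETA))
```

Applied to these values, the cutoff separates k and w26 from w216,
w15, and w34, which is the division shown in Table 9.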
The significance of the existence of the three
Aβ groups relates directly to the evolution of the
subspecies because of the presence of M.m. domesticus,
M.m. castaneus, and M.m. musculus in one group. Thus
these evolutionary groups must have existed and been
maintained for over one million years, the time since
subspeciation is estimated to have occurred. These
results indicate that Aβ has diversified evolutionarily
as a limited number of discrete allelic forms rather than
as a random array of genetic variants. Choi et al. (1983)
sequenced exon portions of genomic clones, and determined
overall genomic structural features including noncoding
regions, from three different haplotypes, the b, d, and
k haplotypes, and concluded from the data that the
generation of polymorphism has been a random evolutionary
process, most probably through multiple independent
mutational events. Because they analyzed only three
Aβ alleles, their interpretation was overstated, in
part because the three alleles they chose to analyze
happened to be the three prototypic Aβ alleles for the
three distinct evolutionary groups, as the data presented
in this dissertation demonstrate. Therefore, the genetic
mechanisms which generated the Aβ polymorphism are most
probably something other than single random mutational
events, and are more likely either gene conversion or
some form of intragenic recombination.
Evidence for gene conversion in Aβ has been
presented (Mengle-Gaw et al. 1984), but because of the
widespread patterns detected for all of the Aβ alleles
analyzed in this study, the conversion or recombination
event may be more ancient and established in the mouse
haplotypes than previously suspected. The variety of
postulated genetic mechanisms for generation of class
II polymorphism is discussed in the Literature Review
of this dissertation. An intragenic (or interallelic)
recombination mechanism operating at an evolutionary
stage preceding subspeciation would explain the discrete
grouping of the Aβ alleles, which might then diversify
to a greater or lesser extent within their respective Aβ
evolutionary groups. Data consistent with such a
mechanism have been presented by Wakeland and Darby
(1983). The presence of RFLP identical alleles, along
with related but not identical alleles within each
evolutionary group, would be explained by such a
mechanism.

The Aα gene has also been examined in detail at the
DNA sequence level. Benoist et al. (1983b) examined six
different standard laboratory inbred Aα alleles and found
a high degree of polymorphism in the Aα gene, which is
substantiated by the results of this dissertation, and
found hypervariable clusters of amino acid-encoding
nucleotides in the exon encoding the first protein
domain. The RFLP analysis of Aα presented in this
dissertation does not have fine enough resolution
to confirm those results, but it is readily
apparent that the Aα alleles did not conform to any
evolutionary grouping pattern as did the Aβ alleles.
Therefore the mechanisms generating the polymorphism
may have operated over a relatively short span of DNA
within the Aβ gene itself, and perhaps even
within the intron between the first two domain-encoding
exons, as this is likely to be the most polymorphic
intron. This possibility is consistent with the fact
that the genomic probe used in this RFLP analysis would
predominantly detect intron sequence, as described in the
results.

In the most closely related mouse strains of the
Aβ evolutionary groups, one pattern of the Aα alleles
became apparent. For mouse strains which have an Aβ F
value of 0.85 or higher, a close relatedness of the same
strains' Aα alleles was demonstrated. These results
indicate that the Aβ and Aα genes show coevolution over
at least one million years which can be detected in
closely related alleles.
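
This relationship can be checked against the Ak family values in
Tables 9 and 10. The sketch below, with illustrative names, selects
allele pairs whose Aβ F value is at least 0.85 and reports the
corresponding Aα F value. It covers only the five Ak family alleles,
not the full set of strains analyzed.

```python
# F values for the Ak family from Table 9 (A-beta) and Table 10 (A-alpha).
F_ABETA = {
    ("k", "w26"): 1.00, ("k", "w216"): 0.09, ("k", "w15"): 0.09,
    ("k", "w34"): 0.09, ("w26", "w216"): 0.09, ("w26", "w15"): 0.09,
    ("w26", "w34"): 0.09, ("w216", "w15"): 0.58, ("w216", "w34"): 1.00,
    ("w15", "w34"): 0.58,
}
F_AALPHA = {
    ("k", "w26"): 0.90, ("k", "w216"): 0.63, ("k", "w15"): 0.38,
    ("k", "w34"): 0.63, ("w26", "w216"): 0.74, ("w26", "w15"): 0.48,
    ("w26", "w34"): 0.74, ("w216", "w15"): 0.50, ("w216", "w34"): 1.00,
    ("w15", "w34"): 0.50,
}


def closely_related_pairs(f_beta, f_alpha, cutoff=0.85):
    """Return (pair, A-beta F, A-alpha F) for pairs with A-beta F >= cutoff."""
    return [(pair, f, f_alpha[pair]) for pair, f in f_beta.items() if f >= cutoff]


for pair, fb, fa in closely_related_pairs(F_ABETA, F_AALPHA):
    print(pair, fb, fa)
```

For both pairs that pass the cutoff in these tables, the Aα F value
is 0.90 or higher.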

Organization of the t Haplotypes into Aβ Evolutionary
Groups

The t haplotypes, as discussed in detail in the
literature review, all appear to be related to one
another, at least on a general structural level.
Therefore it would be reasonable to assume they have
a very ancient common evolutionary ancestor. As can be
seen in Tables 5 and 6, the t haplotypes are definitively
organized into the b and d Aβ evolutionary groups. The
same pattern is seen as before: identical and closely
related F values are found within the same evolutionary
group. Table 6 demonstrates that the t haplotypes are
not an evolutionarily isolated genetic anomaly but have
undergone the same kinds of Aβ evolutionary changes as
the other wild mice.

Comparisons of class II molecules and class II gene
RF genotypes within the Ap and Ak families.

The grouping of some of the Aβ alleles has its
foundation in serology (Wakeland and Klein 1979a, 1983)
and in tryptic peptide mapping (Wakeland and Darby 1983).
These related alleles have been referred to as the Ap
and Ak families in past publications because of the
close relatedness of their class II molecules, and to
avoid confusion will continue to be referred to as such
here. It is important to note, however, that their
genomic structure indicates that the Ap family Aβ
alleles are integral members of the Aβ d evolutionary
group. The Ak family shows a dramatic split in the
RFLP categorization of its Aβ alleles, as will be
discussed shortly.

In the results section of this dissertation, the
serologic and HPLC tryptic peptide fingerprinting data
have been described which demonstrate a gradation in the
structural similarities of A molecules encoded by
alleles in the Ap family. The most closely related or
"core" alleles of the I-Ap family, which are Ap, A<3,
Aw14, Aw5, and Aw11, encode molecules which probably
differ by only 2 or 3 point mutations in Aβ. The I-Aw3
allele (B10.SAA48) encodes a molecule which is
structurally similar to the A molecules encoded by the
core alleles, but differs from them by minor structural
variations in both subunits. The Aw17 allele (B10.CAS2)
encodes a molecule which differs from the rest of the
Ap family by numerous structural variations in both
subunits (McConnell et al. 1986). In fact, the amount of
structural variation distinguishing Aw17 from the rest
of the I-A molecules encoded by Ap family alleles
indicates it is at best a very peripheral member of the
Ap family.

Some interesting features about the evolutionary
divergence of the Ap family are revealed when the RF
genotypes of the different alleles involved are compared.
The five core alleles of the Ap family all have closely
related or indistinguishable RF genotypes for Aα and Aβ,
indicating that these alleles have very similar genomic
structures. These results are consistent with the
structural similarity of the A molecules they encode
and suggest that these minor variant A alleles must
have recently diverged from a common ancestral allele, as
other alleles with identical RFLPs. But in the case
of the Ap family, a close evolutionary relationship
between Aw3 and the Ap family core alleles is not
established by a comparison of their RFLPs. These results
suggest that the noncoding region of Aw3 may differ in
the region which defines the Aβ evolutionary groups,
although both genes encode molecules with similar
structures. In contrast, Aw17 encodes an A molecule
with much less similarity to Ap, but appears more
related by RFLP analysis. These observations indicate
that the comparison of RFLPs between two class II alleles
may not always accurately predict the structural
similarity of their expressed gene products.

The Ak family alleles were previously divided into
two groups on the basis of tryptic peptide analysis and
radiochemical sequencing (Wakeland and Darby 1983;
Wakeland et al. 1985). The A molecules of Ak and
Aw26 differ from those of Aw216, Aw15, and Aw34 by
sequence variations affecting amino acid positions 28
and 95 in the β1 domain and by variations affecting two
adjacent tryptic peptides in the α1 domain. These five
alleles, which are closely related by protein analysis
techniques, are as different as possible by RFLP analysis,
and in fact are in two separate evolutionary Aβ groups.
These results indicate that the Aα and Aβ alleles of
these two subsets of what has been called the Ak family
have very unrelated genomic structures despite the fact
that the I-A molecules they encode differ from one
another by only two or three amino acid interchanges. The
genetic mechanisms which have generated the noncoding
differences between these sets of alleles are unknown,
but as discussed earlier, may reflect an intragenic
recombination event, particularly considering that the
exons of these alleles are so nearly identical.

The Aα alleles of the Ak and Ap protein family
members followed the same pattern as those of the Aβ gene.
Therefore, especially with the closely related
haplotypes, the alleles of Aα and Aβ may be evolving
as an AαAβ gene duplex in natural mouse populations, as
has been previously proposed (Wakeland and Darby 1983).
Although the Aα alleles exhibit less RF variability than
alleles of Aβ based on F values, alleles at both loci in
these two families exhibit the same evolutionary
relationships.